Abstract

We give a common description of the Simon, Barabasi-Albert, II-PA and Price growth models
by introducing suitable random graph processes with preferential attachment mechanisms.
Through the II-PA model, we prove the conditions under which the asymptotic degree
distribution of the Barabasi-Albert model coincides with the asymptotic in-degree distribution
of the Simon model. Furthermore, we show that when the number of vertices in the Simon model
(with parameter $\alpha$) goes to infinity, a portion of them behave as a Yule model with
parameters $(\lambda, \beta) = (1 - \alpha, 1)$, and through this relation we explain why the
asymptotic properties of a random vertex in the Simon model coincide with the asymptotic
properties of a random genus in the Yule model. As a by-product of our analysis, we prove the
explicit expression of the in-degree distribution for the II-PA model, given without proof in
[16]. References to traditional and recent applications of these models are also discussed.

Keywords: Preferential attachment; Random graph growth; Discrete and continuous time
models; Stochastic processes.

1 Introduction

A large group of network growth models can be classified as preferential attachment models.
In the simplest preferential attachment mechanism, an edge connects a newly created node to
one of those already present in the network with probability proportional to the number of
edges of that node.

Typically, the properties analyzed for these models concern both the growth of the number of
edges of each node and the growth of the number of nodes.

After the seminal paper by Barabasi and Albert [1], models admitting a preferential
attachment mechanism have been successfully applied to the growth of different real-world
networks, such as, amongst others, physical, biological or social networks. The typical
feature revealing a preferential attachment growth mechanism is the presence of power-law
distributions, e.g., for the degree (or in-degree) of a node selected uniformly at random.

Despite its present success, the preferential attachment paradigm is not new. In fact it 
dates back to a paper by Udny Yule [23], published in 1925 and regarding the development of 
a theory of macroevolution. Specifically the study concerned the time-continuous process of 
creation of genera and the evolution of species belonging to them. Yule proved that when time 
goes to infinity, the limit distribution of the number of species in a genus selected uniformly 
at random has a specific form and exhibits a power-law behavior in its tail. Thirty years later, 
the Nobel laureate Herbert A. Simon proposed a time-discrete preferential attachment model 
to describe the appearance of new words in a large piece of a text. Interestingly enough, 
the limit distribution of the number of occurrences of each word, when the number of words 
diverges, coincides with that of the number of species belonging to the randomly chosen genus 
in the Yule model, for a specific choice of the parameters. This fact explains the designation 
Yule-Simon distribution that is commonly assigned to that limit distribution. 

Furthermore, it should be noted that the Barabasi-Albert model exhibits an asymptotic
degree distribution that equals the Yule-Simon distribution for a specific choice of the
parameters and still presents power-law characteristics for more general choices of the
parameters. The same also happens for other preferential attachment models.

The Yule, Simon and Barabasi-Albert models share the preferential attachment paradigm, which
seems to play an important role in the explanation of the scale-freeness of real networks.
However, the mathematical tools classically used in their analysis are different. This makes
it difficult to understand in which sense models producing very similar asymptotic
distributions are actually related to one another. Although this relation is often remarked
upon and heuristically justified, no rigorous proofs exist clarifying the conditions under
which it holds. Researchers from different disciplines, for example theoretical physicists
and economists, have asked about the relations between the Simon, Barabasi-Albert and Yule
models, and also some other models closely related to these three (sometimes confused in the
literature under one of the previous names). Partial studies in this direction exist, but
there is still a lack of clarifying rigorous results that would avoid errors and facilitate
the extension of the models.

The existing results refer to specific models and conditions but there is not a unitary 
approach to the problem. For instance, in [3], the authors compared the distribution of the 
number of occurrences of a different word in Simon model, when time goes to infinity, with 
the degree distribution in the Barabasi-Albert model, when the number of vertices goes to 
infinity. In m, an explanation relating the asymptotic distribution of the number of species 
in a random genus in Yule model and that of the number of different words in Simon model 
appears. More recently, following a heuristic argument, Simkin and Roychowdhury m gave 
a justification of the relation between Yule and Simon models. 

The aim of this paper is to study rigorously the relations between these three models. 
A fourth model, here named II-PA model (second preferential attachment model), will be 
discussed in order to better highlight the connections between Simon and Barabasi-Albert 
models. Also we include the Price model that predates the Barabasi-Albert model, and is in 
fact the first model using a preferential attachment rule for networks. 

The idea at the basis of our study is to make use of random graph processes theory to 
deal with all the considered discrete-time models and to include in this analysis also the 
continuous-time Yule model through the introduction of two suitable discrete-time processes 
converging to it. In this way we find a relationship between the discrete-time models and the
continuous-time Yule model, which is easier to handle and has been extensively studied.
Translating results from discrete models to their continuous counterparts is usually a
powerful method for analyzing asymptotic properties. Theorems 4.3 and 4.4 provide an easy
tool for this.

The random graph process approach was used by Barabasi and Albert to define their
preferential attachment model of the World Wide Web [1]. At each discrete time step a new
vertex is added together with $m$ edges originating from it. The end points of these edges are
selected with probability proportional to the current degree of the vertices in the network.
Simulations from this model show that the proportion of vertices with degree $k$ is
$C_m k^{-\gamma}$, with $\gamma$ close to 3 and $C_m > 0$ independent of $k$. A mathematically
rigorous study of this model was then performed by Bollobas, Riordan, Spencer and Tusnady [3]
making use of random graph theory. The rigorous presentation of the model allowed the authors
to prove that the proportion of vertices with degree $k$ converges in probability to
$m(m+1)B(k,3)$ as the number of vertices diverges, where $B(x,y)$ is the Beta function.

Here we reconsider all the models of interest in a random graph process framework. In
Section 2 we introduce the necessary notation and basic definitions. Then, in Section 3 we
present the preferential attachment models of interest, i.e. the Simon, II-PA, Price,
Barabasi-Albert and Yule models, through a mathematical description that makes use of the
random graphs approach. Such a description allows us to highlight an aspect that is not
always well underlined: the asymptotic distributions that in some cases coincide do not
always refer to the same quantity. For instance, the Barabasi-Albert model describes the
degree of the vertices while the II-PA model considers the in-degree. In Section 3 we also
discuss the historical context and the list of available mathematical results for each model.
The point of view provided by random graph processes then permits us to prove the novel
results presented in Section 4. The theorems described and proved there clarify the relations
between the considered asymptotic distributions of the different models, specifying for which
choice of the parameters these distributions coincide and when they are not related.


In the concluding Section 5 we summarize the proven results and illustrate with a
diagram the cases in which the considered models are actually related.


2 Definitions and mathematical background 

In this section we introduce some classical definitions, theorems and mathematical tools we 
will use in the rest of the paper. 

Let us define a graph $G = (V,E)$ as an ordered pair comprising a set of vertices $V$ and
a set of edges or lines $E$, which are 2-element subsets of $V$, so $E \subseteq V \times V$.
A graph $G$ is directed if its edges are directed, i.e., if for every edge $(i,j) \in E$,
$(i,j) \neq (j,i)$; otherwise $G$ is called an undirected graph.

We say that $G$ is a random graph if it is a graph selected according to a probability
distribution over a set of graphs, or if it is determined by a stochastic process that
describes the random evolution of the graph in time. A stochastic process generating a random
graph is called a random graph process. In other words, a random graph process is a family
$(G^t)_{t \in T}$ of random graphs (defined on a common probability space), where $t$ is
interpreted as time and $T$ can be either countable or uncountable.

A loop is an edge that connects a vertex to itself. The in-degree of a vertex $v$ at time $t$,
denoted by $\vec{d}(v,t)$, is the number of incoming edges (incoming connections). Similarly,
the degree of a vertex $v$ at time $t$, denoted by $d(v,t)$, is the total number of incoming
and outgoing edges at time $t$ (when an edge is a loop, it is counted twice). In this paper we
also use the term directed loop to indicate a loop that contributes one to the in-degree.

The random graphs studied in this paper are random graph processes starting at time
$t = 0$ with neither edges nor vertices, and growing monotonically by adding at each discrete
time step either a new vertex or some directed edges between the vertices already present,
according to some law $\mathbb{P}(v_i \to v_j) = \mathbb{P}((i,j) \in G^t)$.

We focus here on the analysis of the number of vertices with degree or in-degree $k$ at
time $t$, which we denote by $N_{k,t}$ and $\vec{N}_{k,t}$, respectively. In particular we are
interested in the asymptotic degree or in-degree distribution of a random vertex, i.e., in the
proportion $N_{k,t}/V_t$ or $\vec{N}_{k,t}/V_t$ as $t$ goes to infinity, where $V_t$ denotes
the total number of vertices at time $t$. When necessary, we will add a superscript to
$N_{k,t}$ or $\vec{N}_{k,t}$ to indicate the process to which we refer.

Furthermore, we will make use of the following standard notation: for (deterministic)
functions $f = f(t)$ and $g = g(t)$, we write $f = O(g)$ if $f/g$ is bounded as
$t \to \infty$, $f \sim g$ if $\lim_{t \to \infty} f/g = 1$, and $f = o(g)$ if
$\lim_{t \to \infty} f/g = 0$.

One of the methods used in the literature to study the asymptotic behavior of $N_{k,t}/V_t$ or
$\vec{N}_{k,t}/V_t$ is to prove that these random processes concentrate around their
expectations. To do this, the Azuma-Hoeffding inequality is applied when possible (see also
[10], page 93).

Lemma 2.1 (Azuma and Hoeffding inequality [12]). Let $(X_t)_{t \geq 0}$ be a martingale with
$|X_s - X_{s-1}| \leq c$ for $1 \leq s \leq t$ and $c$ a positive constant. Then

$$\mathbb{P}(|X_t - X_0| > x) \leq \exp\left(-\frac{x^2}{2c^2 t}\right). \tag{2.1}$$

Among the first authors to use this approach in the study of preferential attachment random
graphs were Bollobas et al. [3]. Here we apply this approach to study different random
graph processes. In Section 3 we illustrate this technique by analyzing the Simon model and
reporting the corresponding computations for the Barabasi-Albert model.
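
As a rough numerical illustration of (2.1), the following sketch estimates the upper tail of a simple symmetric random walk, which is a martingale with increments bounded by $c = 1$, and compares it with the bound. The choice of the walk, the function names and the parameter values are assumptions made only for this example. The check uses the one-sided tail $\mathbb{P}(X_t - X_0 > x)$, for which the bound holds as written; two-sided versions are usually stated with an additional factor 2.

```python
import math
import random


def azuma_bound(x, c, t):
    """Right-hand side of (2.1): exp(-x^2 / (2 c^2 t))."""
    return math.exp(-x * x / (2.0 * c * c * t))


def empirical_upper_tail(x, t, runs, seed=0):
    """Estimate P(X_t - X_0 > x) for a simple symmetric +/-1 random walk."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(runs):
        # X_t - X_0 is the sum of t independent +/-1 steps (increments bounded by c = 1).
        displacement = sum(rng.choice((-1, 1)) for _ in range(t))
        if displacement > x:
            hits += 1
    return hits / runs


if __name__ == "__main__":
    t, x = 100, 20.0
    print("empirical tail:", empirical_upper_tail(x, t, runs=2000))
    print("bound (2.1):   ", azuma_bound(x, 1.0, t))
```

The empirical tail is typically far below the bound, which is expected: (2.1) only uses the boundedness of the increments, not their actual distribution.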


3 Preliminaries: Preferential attachment models 

As stressed in the introduction, a number of models that make use of “preferential
attachment” mechanisms are present in the literature. Here we consider some of them,
rigorously introducing the corresponding random graph processes in order to allow a
comparison of their features. To this aim, it helps to present the best-known models using a
common notation. We first discuss the case of discrete-time preferential attachment models,
specifically the Simon and Barabasi-Albert models, and some others inspired by the Simon
model: the II-PA model (second preferential attachment model) and the Price model, which will
help us to understand the relations between the Simon and Barabasi-Albert models. Moreover,
we also discuss a continuous-time preferential attachment model, the Yule model, which is
defined in terms of independent homogeneous linear birth processes. We rigorously prove that
this model can be related to the Simon model, and hence to the Barabasi-Albert model.

The Barabasi-Albert model presented in [1] omits some details that are necessary for it to be
formulated in terms of a random graph process. Here we follow the description detailed in
Bollobas et al. [3], where the rules for the growth of the random graph not mentioned in [1]
are given. Furthermore, to make each model easier to understand, we follow the same scheme
for every presentation, noting the absence of some results when they are not yet available.

Our scheme considers: 

1. The mathematical description of the associated graph structure and its growth law. 

2. The historical context motivating the first proposal of the model and some successive 
applications. 

3. Available results on the degree or in-degree distribution with particular reference to
power-law behavior. We collect both theorems and simulation results.

3.1 Simon model 

1. Mathematical description: The Simon model can be described as a random graph
process in discrete time $(G_\alpha^t)_{t \geq 1}$, so that $G_\alpha^t$ is a directed graph
which starts at time $t = 1$ with a single vertex $v_1$ and a directed loop. Then, given
$G_\alpha^t$, one forms $G_\alpha^{t+1}$ by either adding with probability $\alpha$ a new
vertex $v_i$ with a directed loop, $i \leq t+1$, or adding with probability $(1-\alpha)$ a
directed edge between the last added vertex $v$ and $v_j$, $1 \leq j \leq t$, where the
probability of $v_j$ being chosen is proportional to its in-degree, i.e.,

$$\mathbb{P}(v \to v_j) = (1-\alpha)\,\vec{d}(v_j,t)/t, \quad 1 \leq j \leq t. \tag{3.1}$$

In Figure 1 we illustrate the growth law of this graph.


Figure 1: Construction of $(G_\alpha^t)_{t \geq 1}$. (a) Begin at time 1 with one single
vertex and a directed loop. (b) Suppose some time has passed; in this case, the picture
corresponds to a realization of the process at time $t = 4$. (c) Given $G_\alpha^4$, form
$G_\alpha^5$ by either adding with probability $\alpha$ a new vertex with a directed loop, or
adding a directed edge with probability given by (3.1).

2. Historical context: In nn, Simon considered a model to describe the growth of a 
text that is being written such that a word is added at each time $t \geq 1$. Different words
correspond to different vertices and repeated words to directed edges in the previous
description. Simon introduced the two following conditions: for $\alpha \in (0,1)$,

(a) $\mathbb{P}[(t+1)\text{th word has not yet appeared at time } t] = \alpha$,

(b) $\mathbb{P}[(t+1)\text{th word has appeared } k \text{ times at time } t] = (1-\alpha)\,k\,\vec{N}_{k,t}/t$,

where $\vec{N}_{k,t}$ is the number of different words that have appeared exactly $k$ times at
time $t$, or the number of vertices that have exactly $k$ incoming edges (i.e. in-degree $k$)
at time $t$ in $G_\alpha^t$. Thus, at time $t+1$ either with probability $\alpha$ a new word
appears (i.e., a new vertex $v_i$, $i \leq t+1$, with a directed loop appears), or with
probability $(1-\alpha)$ the word is not new and, if it has appeared $k$ times at time $t$, a
directed edge is added. The starting point of this edge is the last vertex that has appeared
in $G_\alpha^t$, while its end point is selected with probability (3.1), which, conditionally
on the word not being new, corresponds in this case to $k/t$.

3. Available results: Simon was interested in getting results for the proportion of vertices
that have exactly in-degree $k$, with respect to the total number of vertices $V_t$ at time
$t$. Thus, he proved asymptotic results for $\mathbb{E}\vec{N}_{k,t}/\mathbb{E}V_t$ as
$t \to \infty$.

Next, we will give a brief synopsis of the computations made by Simon in m- The idea 
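is to condition on what has happened until time $t$ and compute the expected value at
time $t+1$. For $k = 1$ it holds

$$\mathbb{E}\vec{N}_{1,t+1} = \alpha + \left(1 - \frac{1-\alpha}{t}\right)\mathbb{E}\vec{N}_{1,t}, \tag{3.2}$$

and, for $k > 1$,

$$\mathbb{E}\vec{N}_{k,t+1} = \frac{1-\alpha}{t}\left[(k-1)\,\mathbb{E}\vec{N}_{k-1,t} - k\,\mathbb{E}\vec{N}_{k,t}\right] + \mathbb{E}\vec{N}_{k,t}. \tag{3.3}$$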


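Simon solved (3.2) and (3.3) (see also m, pages 98-99) to get, as $t \to \infty$,

$$\frac{\mathbb{E}\vec{N}_{1,t}}{t} \to \frac{\alpha}{2-\alpha}, \tag{3.4}$$

and for $k > 1$,

$$\frac{\mathbb{E}\vec{N}_{k,t}}{t} \to \frac{\alpha}{1-\alpha}\,\frac{\Gamma(k)\,\Gamma\left(1+\frac{1}{1-\alpha}\right)}{\Gamma\left(k+1+\frac{1}{1-\alpha}\right)}, \tag{3.5}$$

where $\Gamma$ is the gamma function.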

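Observe now that the number of vertices that have appeared until time $t$ satisfies
$V_t \sim \mathrm{Bin}(t,\alpha)$, so $\mathbb{E}V_t = \alpha t$. Hence, using this together
with (3.4) and (3.5), for $k = 1$,

$$\frac{\mathbb{E}\vec{N}_{1,t}}{\mathbb{E}V_t} \to \frac{1}{2-\alpha}, \tag{3.6}$$

and for $k > 1$,

$$\frac{\mathbb{E}\vec{N}_{k,t}}{\mathbb{E}V_t} \to \frac{1}{1-\alpha}\,\frac{\Gamma(k)\,\Gamma\left(1+\frac{1}{1-\alpha}\right)}{\Gamma\left(k+1+\frac{1}{1-\alpha}\right)} = \frac{1}{1-\alpha}\,B\left(k, 1+\frac{1}{1-\alpha}\right), \tag{3.7}$$

as $t \to \infty$, where $B(x,y)$ is the Beta function.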

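Now, let $\mathcal{G}_r$ and $\mathcal{G}_s$ denote the $\sigma$-fields generated by the
appearance of directed edges up to time $r$ and $s$ respectively, $r \leq s \leq t$. Since
$\mathbb{E}[\mathbb{E}(\vec{N}_{k,t} \mid \mathcal{G}_s) \mid \mathcal{G}_r] = \mathbb{E}(\vec{N}_{k,t} \mid \mathcal{G}_r)$,
the process $Z_s = \mathbb{E}(\vec{N}_{k,t} \mid \mathcal{G}_s)$ is a martingale such that
$Z_t = \vec{N}_{k,t}$ and $Z_0 = \mathbb{E}\vec{N}_{k,t}$.

Furthermore, observe that at each unit of time, say $s$, either a new vertex appears or
the last one added attaches to another existing vertex $v_j$, $j \leq s$. Note that this does
not affect the in-degree of $v \neq v_j$, or the probabilities that these vertices will be
chosen later, so it yields that $|Z_s - Z_{s-1}| \leq 1$. Then, it is possible to use the
Azuma-Hoeffding inequality (2.1) and obtain that, for every $\epsilon_t \gg t^{-1/2}$ (for
example take $\epsilon_t = \sqrt{\ln t/t}$),

$$\mathbb{P}\left(\left|\frac{\vec{N}_{k,t}}{t} - \frac{\mathbb{E}\vec{N}_{k,t}}{t}\right| \geq \epsilon_t\right) \leq \exp\left(-\frac{(t\epsilon_t)^2}{2t}\right) \to 0. \tag{3.8}$$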


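Now, using Chebyshev's inequality, for every $\epsilon_t > 0$ such that
$t\epsilon_t^2 \to \infty$ as $t \to \infty$,

$$\mathbb{P}\left(\left|\frac{V_t}{t} - \frac{\mathbb{E}V_t}{t}\right| \geq \epsilon_t\right) \leq \frac{t\alpha(1-\alpha)}{t^2\epsilon_t^2} \to 0. \tag{3.9}$$

Hence, by (3.8) and (3.9), $\vec{N}_{k,t}/t - \mathbb{E}\vec{N}_{k,t}/t \to 0$ and
$V_t/t - \mathbb{E}V_t/t \to 0$ in probability.

Finally, since $\mathbb{E}\vec{N}_{k,t}/t$ converges as $t$ goes to infinity to the constant
values (3.4) and (3.5), and $\mathbb{E}V_t/t = \alpha$ because $V_t$ is a random variable with
binomial distribution $\mathrm{Bin}(t,\alpha)$, by properties of convergence in probability we
obtain that

$$\frac{\vec{N}_{k,t}}{V_t} \to \frac{1}{1-\alpha}\,\frac{\Gamma(k)\,\Gamma\left(1+\frac{1}{1-\alpha}\right)}{\Gamma\left(k+1+\frac{1}{1-\alpha}\right)} = \frac{1}{1-\alpha}\,B\left(k, 1+\frac{1}{1-\alpha}\right) \quad \text{in probability.} \tag{3.10}$$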
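
The limit (3.10) can be checked numerically with a short simulation. The sketch below is only an illustration under stated assumptions: it tracks in-degrees alone (the starting point of each edge does not affect them), it uses an urn with one entry per unit of in-degree so that a uniform draw is a draw proportional to in-degree, and all function names and parameter values are hypothetical.

```python
import math
import random
from collections import Counter


def simon_in_degrees(alpha, steps, seed=0):
    """Simulate the Simon model up to time `steps`; return the in-degree of each vertex."""
    rng = random.Random(seed)
    # One urn entry per unit of in-degree: a uniform draw from the urn picks
    # vertex v_j with probability d(v_j, t) / t.
    urn = [0]  # time 1: vertex 0 with a directed loop
    n_vertices = 1
    for _ in range(steps - 1):
        if rng.random() < alpha:
            urn.append(n_vertices)  # new vertex with a directed loop
            n_vertices += 1
        else:
            urn.append(rng.choice(urn))  # new edge pointing to a vertex chosen by in-degree
    counts = Counter(urn)
    return [counts[v] for v in range(n_vertices)]


def simon_limit(k, alpha):
    """Right-hand side of (3.10): (1/(1-alpha)) * B(k, 1 + 1/(1-alpha))."""
    rho = 1.0 / (1.0 - alpha)
    log_beta = math.lgamma(k) + math.lgamma(1.0 + rho) - math.lgamma(k + 1.0 + rho)
    return rho * math.exp(log_beta)


def empirical_fraction(degrees, k):
    """Fraction of vertices whose in-degree equals k."""
    return sum(1 for d in degrees if d == k) / len(degrees)


if __name__ == "__main__":
    degrees = simon_in_degrees(alpha=0.5, steps=100_000)
    for k in (1, 2, 3):
        print(k, empirical_fraction(degrees, k), simon_limit(k, 0.5))
```

For $\alpha = 1/2$ and $k = 1$ the limit equals $1/(2-\alpha) = 2/3$, in agreement with (3.6).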


3.2 II-PA model (second preferential attachment model) 

1. Mathematical description: In [16] a different model is analyzed. In that paper it
is called the Yule model and described in discrete time. The model is also defined as a
preferential attachment model, but in this case at each time step $n$ a new vertex is
added with exactly $m+1$ directed edges, $m \in \mathbb{Z}^+$. These edges start from the new
vertex and are directed towards any of the previously existing vertices according to a
preferential attachment rule. To define a random graph process formally, we can think
for a moment of time as rescaled by $1/(m+1)$, so that at each unit of time $n$, $m+1$
rescaled time steps happen. Let $(G_m^t)_{t \geq 1}$ be a random graph process such that
for all $n \in \mathbb{Z}^+ \cup \{0\}$,

(a) at time $t = n(m+1)+1$ add a new vertex $v_{n+1}$ with a directed loop (it does count
one for the in-degree), and

(b) for $i = 2, \dots, m+1$, at each time $t = n(m+1)+i$ add a directed edge from $v_{n+1}$
to $v_j$, $1 \leq j \leq n+1$, with probability

$$\mathbb{P}(v_{n+1} \to v_j) = \frac{\vec{d}(v_j, t-1)}{t-1}. \tag{3.11}$$

Note then that $(G_m^t)_{t \geq 1}$ starts at time $t = 1$ with a single vertex and one
directed loop.

In Figure 2 we illustrate the growth law of this graph.

Figure 2: Construction of $(G_m^t)_{t \geq 1}$ for $m = 2$. (a) Begin at time 1 with one
single vertex and a directed loop. (b) Suppose some time has passed; in this case, the picture
corresponds to a realization of the process at time $t = 3$. Keep in mind that here $m = 2$
and therefore $m = 2$ directed edges are added to the graph by the preferential attachment
rule (but at this point the only possible choice is the vertex $v_1$). (c) Here time is
$t = 5$. A new vertex $v_2$ already appeared at time $t = 4$ together with a directed loop. At
time 5 instead the first of the $m$ edges that must be added to the graph is chosen (red
dashed directed edges) by means of the preferential attachment probabilities (3.11).


2. Historical context: In [16], Newman describes this model in terms of genus and species
as follows.

Species are added to genera by “speciation”, the splitting of one species into 
two, [...]. If we assume that this happens at some stochastically constant rate, 
then it follows that a genus with k species in it will gain new species at a rate 
proportional to k, since each of the k species has the same chance per unit time 
of dividing in two. Let us further suppose that occasionally, say once every m 
speciation events, the new species produced is, by chance, sufficiently different
from the others in its genus as to be considered the founder member of an 
entire new genus. (To be clear, we define m such that m species are added 
to preexisting genera and then one species forms a new genus. So m + 1 new 
species appear for each new genus and there are m + 1 species per genus on 
average.) 

This description is linked to the model proposed by Simon; the difference is that the
original Simon model does not fix $m$ speciation events. Instead, it assumes that the
number of speciation events is random and follows a geometric distribution
$\mathrm{Geo}(\alpha)$, with $0 < \alpha < 1$.


3. Available results: Note that the number of vertices with in-degree equal to $k$ is
equivalent to the number of genera that have $k$ species; thus, the number of vertices with
in-degree equal to $k$ at time $t = n(m+1)$ corresponds to the number of genera that
have $k$ species when the number of genera is $n$.

Let $\vec{N}_{k,t}$ be the number of vertices with in-degree equal to $k$ in
$(G_m^t)_{t \geq 1}$. In [16] a heuristic analysis of the II-PA model shows that the
proportion of vertices that have exactly in-degree $k$, with respect to the total number of
vertices at time $t = n(m+1)$, is in the limit

$$\lim_{n \to \infty} \frac{\vec{N}_{k,t}}{n} = \frac{(1+1/m)\,\Gamma(k)\,\Gamma(2+1/m)}{\Gamma(k+2+1/m)} = (1+1/m)\,B(k, 2+1/m). \tag{3.12}$$

We prove this in Theorem 4.2, where the result is obtained with probability one.
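
A simulation sketch of the II-PA growth rule, compared with the limit (3.12), is given below. It assumes that the end point of each edge is drawn with probability equal to its in-degree divided by the current total in-degree $t - 1$, with the in-degrees updated after every single edge; the function names and parameter values are hypothetical.

```python
import math
import random
from collections import Counter


def iipa_in_degrees(m, n_vertices, seed=0):
    """Simulate n_vertices complete rounds of the II-PA model; return all in-degrees."""
    rng = random.Random(seed)
    urn = []  # one entry per unit of in-degree
    for v in range(n_vertices):
        urn.append(v)  # step (a): new vertex with a directed loop
        for _ in range(m):
            # step (b): the end point is drawn proportionally to in-degree,
            # among all vertices present, the new one included
            urn.append(rng.choice(urn))
    counts = Counter(urn)
    return [counts[v] for v in range(n_vertices)]


def iipa_limit(k, m):
    """Right-hand side of (3.12): (1 + 1/m) * B(k, 2 + 1/m)."""
    a = 2.0 + 1.0 / m
    log_beta = math.lgamma(k) + math.lgamma(a) - math.lgamma(k + a)
    return (1.0 + 1.0 / m) * math.exp(log_beta)


def fraction_with_in_degree(degrees, k):
    """Fraction of vertices whose in-degree equals k."""
    return sum(1 for d in degrees if d == k) / len(degrees)


if __name__ == "__main__":
    degrees = iipa_in_degrees(m=2, n_vertices=30_000)
    for k in (1, 2, 3):
        print(k, fraction_with_in_degree(degrees, k), iipa_limit(k, 2))
```

For $m = 2$ and $k = 1$ the limit is $(1+1/m)/(2+1/m) = 0.6$, and the simulated fraction should be close to this value for large $n$.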


3.3 Price model 


1. Mathematical description: In [15], the Price model is described as a random graph
process in discrete time $(G_m^n)_{n \geq 1}$, so that $G_m^n$ is a directed graph and the
process starts at time $n = 1$ with a single vertex, $v_1$, and $M_1 + k_0$ directed loops,
where $k_0 > 0$ is constant and $M_1$ is a random variable with expectation $m$. New vertices
are continually added to the network, though not necessarily at a constant rate. Each added
vertex has a certain out-degree, and this out-degree is fixed permanently at the creation of
the vertex. The out-degree may vary from one vertex to another, but the mean out-degree,
which is denoted $m$, is constant over time. Thus, given $G_m^n$, form $G_m^{n+1}$ by adding a
new vertex $v_{n+1}$ with $k_0$ directed loops, and from it a random number $M_{n+1}$ of
directed edges to different old vertices with probability proportional to their in-degrees at
time $n$, i.e.,

$$\mathbb{P}(v_{n+1} \to v_j \mid M_1 = m_1, \dots, M_n = m_n) = \frac{\vec{d}(v_j,n)}{n k_0 + \sum_{i=1}^{n} m_i}, \quad 1 \leq j \leq n, \tag{3.13}$$

where $M_1, \dots, M_{n+1}$ are taken independent and identically distributed, with
$\mathbb{E}(M_1) = m$, and $m$ a positive rational number. Note that this model does not take
into account the update of the probabilities (3.13) every single time an edge is added.

2. Historical context: In [7], Price describes empirically the nature of the total world 
network of scientific papers, and it is probably the first example of what is now called 
a scale-free network. In [8], he formalizes a model giving rise to what he calls the 
cumulative advantage distribution. He finds a system of differential equations describing 
the process, and solves them under specific assumptions. All the derivations are made
for $k_0 = 1$.

3. Available results: Let $\vec{N}_{k,n}$ be the number of vertices with in-degree equal to
$k$ in $(G_m^n)_{n \geq 1}$. Newman analyzes this model by using the method of master
equations for the case $k_0 = 1$, and finds the same system as in [16] for the analysis of the
II-PA model. Thus, he obtains the same limit solution for the proportion of vertices with
in-degree $k$ as in the II-PA model, i.e.,

$$\lim_{n \to \infty} \frac{\vec{N}_{k,n}}{n} = \frac{(1+1/m)\,\Gamma(k)\,\Gamma(2+1/m)}{\Gamma(k+2+1/m)} = (1+1/m)\,B(k, 2+1/m). \tag{3.14}$$

A rigorous analysis of (3.14) can be made using Chebyshev's inequality and following
the same lines as in the proof of Theorem 4.2 for the II-PA model (see Section 4.1).
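
Two properties of the limit in (3.14) are easy to verify numerically: it sums to one over $k \geq 1$, so it is a probability distribution, and its tail decays like a power law with exponent $2 + 1/m$ (a consequence of the standard asymptotics $B(k,a) \sim \Gamma(a)k^{-a}$). The sketch below checks both; the function names and the truncation levels are assumptions made for this illustration only.

```python
import math


def price_limit(k, m):
    """Right-hand side of (3.14): (1 + 1/m) * B(k, 2 + 1/m)."""
    a = 2.0 + 1.0 / m
    log_beta = math.lgamma(k) + math.lgamma(a) - math.lgamma(k + a)
    return (1.0 + 1.0 / m) * math.exp(log_beta)


def total_mass(m, k_max=100_000):
    """Partial sum of the limit distribution over k = 1, ..., k_max."""
    return sum(price_limit(k, m) for k in range(1, k_max + 1))


def local_tail_exponent(m, k=5_000):
    """Slope of log p_k against log k, measured between k and 2k."""
    return math.log(price_limit(2 * k, m) / price_limit(k, m)) / math.log(2.0)


if __name__ == "__main__":
    for m in (1, 2, 1.5):
        print(m, total_mass(m), local_tail_exponent(m), -(2.0 + 1.0 / m))
```

Since $m$ is only required to be a positive rational number, non-integer values such as $m = 1.5$ are admissible inputs here.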

3.4 Barabasi-Albert model


1. Mathematical description: In [3], Bollobas, Riordan, Spencer, and Tusnady make
the Barabasi and Albert model precise in terms of a random graph process. We follow
their description in this paragraph. At each time step a new vertex is added with $m$
different directed edges, $m \in \mathbb{Z}^+$. For the case $m = 1$, let
$(G_1^t)_{t \geq 1}$ be a random graph process so that $G_1^t$ is a directed graph which
starts at time $t = 1$ with one vertex $v_1$ and one loop. Then, given $G_1^t$, form
$G_1^{t+1}$ by adding the vertex $v_{t+1}$ together with a single edge directed from
$v_{t+1}$ to $v_j$, $1 \leq j \leq t+1$, with probability

$$\mathbb{P}(v_{t+1} \to v_j) = \begin{cases} \dfrac{d(v_j,t)}{2t+1}, & 1 \leq j \leq t, \\[1ex] \dfrac{1}{2t+1}, & j = t+1. \end{cases} \tag{3.15}$$

For $m > 1$ define the process $(G_m^t)_{t \geq 1}$ by running the process $(G_1^t)$ on the
sequence of imaginary vertices $v'_1, v'_2, \dots$, then form the graph $G_m^t$ from
$G_1^{mt}$ by identifying the vertices $v'_1, v'_2, \dots, v'_m$ to form $v_1$,
$v'_{m+1}, v'_{m+2}, \dots, v'_{2m}$ to form $v_2$, and so on.

We can also define this model in a similar manner as we did for the II-PA model.
Thinking once more that time increases with a scaling of $1/(m+1)$, let us define the
process $(G_m^t)_{t \geq 1}$ such that, for every $n \in \mathbb{Z}^+ \cup \{0\}$,

(a) at time $t = n(m+1)+1$ add a new vertex $v_{n+1}$,

(b) for $i = 2, \dots, m+1$, at each time $t = n(m+1)+i$ add an edge from $v_{n+1}$ to $v$,
where $v$ is chosen with

$$\mathbb{P}(v_{n+1} \to v) = \begin{cases} \dfrac{d(v,t-1)}{2(mn+i-1)-1}, & v \neq v_{n+1}, \\[1ex] \dfrac{d(v,t-1)+1}{2(mn+i-1)-1}, & v = v_{n+1}. \end{cases} \tag{3.16}$$

Observe that $(G_m^t)_{t \geq 1}$ starts at time $t = 1$ with just a single vertex, without
loops.

2. Historical context: Barabasi and Albert, in [1], proposed a random graph model of the
growth of the World Wide Web, where the vertices represent sites or web pages, and the
edges represent links between sites. In this process the vertices are added to the graph one
at a time and joined to a fixed number of earlier vertices, selected with probability
proportional to their degree. This preferential attachment assumption originates from the
idea that a new site is more likely to join popular sites than disregarded sites. The model is
described as follows.

Starting with a small number ($m_0$) of vertices, at every time step add a new
vertex with $m$ ($\leq m_0$) edges that link the new vertex to $m$ different vertices
already present in the system. To incorporate preferential attachment, assume
that the probability that a new vertex will be connected to a vertex $i$ depends
on the connectivity $k_i$ of that vertex, so it would be equal to $k_i/\sum_j k_j$. Thus,
after $t$ steps the model leads to a random network with $t + m_0$ vertices and $mt$
edges.

To write a mathematical description of the process given above it is necessary to clarify
some details. First, since the model starts with $m_0$ vertices and no edges, the
vertex degrees are initially zero, so the probability that the new vertex is connected to
a vertex $i$, $1 \leq i \leq m_0$, is not well defined. Second, to link the new vertex to $m$
different vertices already present, it would be necessary to repeat $m$ times the experiment
of choosing an old vertex, but the model does not say anything about changes in the
attachment probabilities at each time, i.e. it is not explained whether the $m$ old vertices
are chosen simultaneously or sequentially. These observations were made by Bollobas, Riordan,
Spencer, and Tusnady in [3], where, after noting the problems in the Barabasi-Albert model,
they give an exact definition of a random graph process that fits that description.

3. Available results: In [1], Barabasi and Albert obtain through simulation that after
many time steps the proportion of vertices with degree $k$ obeys a power law $C k^{-\gamma}$,
where $C$ is a constant and $\gamma = 2.9 \pm 0.1$, and by a heuristic argument they suggest
that $\gamma = 3$. Let $N_{k,t}$ be the number of vertices with degree equal to $k$ in
$(G_m^t)_{t \geq 1}$. In [3] Bollobas, Riordan, Spencer, and Tusnady analyzed this model
mathematically. Their first result is that, for $t = n(m+1)$, i.e., when the total number of
vertices is $n$, and $m \leq k \leq m + n^{1/15}$ (the bound $k \leq m + n^{1/15}$ is chosen
to make the proof as easy as possible),

$$\frac{\mathbb{E}N_{k,t}}{n} \sim \frac{2m(m+1)}{k(k+1)(k+2)}$$

uniformly in $k$.

The authors consider $\mathcal{F}_s$, the $\sigma$-field generated by the appearance of
directed edges up to time $s$, $s \leq t$, and define
$Z_s = \mathbb{E}(N_{k,t} \mid \mathcal{F}_s)$, which is a martingale satisfying
$|Z_s - Z_{s-1}| \leq 2$, $Z_t = N_{k,t}$ and $Z_0 = \mathbb{E}N_{k,t}$, $k = 1, 2, \dots$
(at time $t = 0$ the random graph is the empty graph). Using the Azuma-Hoeffding inequality
(2.1) they obtain that

$$\mathbb{P}\left(\left|\frac{N_{k,t}}{n} - \frac{\mathbb{E}N_{k,t}}{n}\right| \geq \sqrt{\frac{\ln t}{t}}\right) \leq \exp\left(-\frac{\ln t}{8(m+1)^2}\right) \to 0$$

as $t$ goes to infinity (the right-hand side follows from (2.1) with $c = 2$ and
$x = n\sqrt{\ln t/t}$, using $t = n(m+1)$). Hence, it follows that, for every $k$ in the range
$m \leq k \leq m + n^{1/15}$,

$$\frac{N_{k,t}}{n} \to \frac{2m(m+1)}{k(k+1)(k+2)}$$

in probability. Thus, the proportion of vertices with degree $k$ satisfies

$$\frac{N_{k,t}}{n} \to m(m+1)B(k,3)$$

in probability as $t \to \infty$. Note that

$$\frac{2m(m+1)}{k(k+1)(k+2)} = m(m+1)B(k,3). \tag{3.17}$$


Furthermore, since the Beta function satisfies the asymptotics
$B(x,y) \sim \Gamma(y)\,x^{-y}$ for $x$ large, the limiting proportion satisfies
$m(m+1)B(k,3) \sim 2m(m+1)k^{-3}$ as $k \to \infty$, which is a power law for large values of
$k$ with $\gamma = 3$, as Barabasi and Albert suggested. Hence, it is proved mathematically
that when vertices are added to the graph one at a time and joined to a fixed number of
existing vertices selected with probability proportional to their degree, the degree
distribution follows a power-law behavior only in the tail (for $k$ big enough), with
exponent $\gamma = 3$. A second proof of this result is given in [23] (see Theorem 8.2).
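
For $m = 1$ the limiting degree distribution $2m(m+1)/(k(k+1)(k+2))$ can be compared with a direct simulation. The sketch below assumes the $m = 1$ rule in the form used by Bollobas et al.: the new vertex $v_{t+1}$ attaches to an old vertex $v_j$ with probability $d(v_j,t)/(2t+1)$ and to itself with probability $1/(2t+1)$. Function names and parameter values are hypothetical.

```python
import random
from collections import Counter


def ba_degrees(n_vertices, seed=0):
    """Simulate the m = 1 Barabasi-Albert process; return the degree of each vertex."""
    rng = random.Random(seed)
    # One entry per edge end point; a loop contributes two entries to its vertex.
    endpoints = [0, 0]  # time 1: vertex 0 with one loop, hence degree 2
    for new in range(1, n_vertices):
        # len(endpoints) equals 2t, so r is uniform over 2t + 1 outcomes:
        # 2t of them select an old vertex by degree, the last one selects `new` itself.
        r = rng.randrange(len(endpoints) + 1)
        target = new if r == len(endpoints) else endpoints[r]
        endpoints.extend((new, target))
    counts = Counter(endpoints)
    return [counts[v] for v in range(n_vertices)]


def ba_limit(k, m=1):
    """Limiting proportion of vertices with degree k >= m."""
    return 2.0 * m * (m + 1) / (k * (k + 1) * (k + 2))


def fraction_with_degree(degrees, k):
    """Fraction of vertices whose degree equals k."""
    return sum(1 for d in degrees if d == k) / len(degrees)


if __name__ == "__main__":
    degrees = ba_degrees(50_000)
    for k in (1, 2, 3, 4):
        print(k, fraction_with_degree(degrees, k), ba_limit(k))
```

Note that the quantity tracked here is the total degree, not the in-degree used in the Simon and II-PA sketches, which is exactly the distinction the text draws between these models.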


3.5 Yule model 

Differently from the previous models, this model evolves in continuous time. We do not
describe this model in terms of random graph processes; however, in Subsection 4.2 we discuss
its relation with the Simon model and conclude that the Yule model can be interpreted as a
continuous-time limit of the Simon model, a model with a random graph interpretation.

1. Mathematical description: In the description of the Yule model we use $T$ to denote
continuous time, instead of $t$, which denotes discrete time, i.e.
$T \in \mathbb{R}^+ \cup \{0\}$ and $t \in \mathbb{Z}^+$. Consider a population starting at
time $T = 0$ with one individual. As time increases, individuals may give birth to new
individuals independently of each other at a constant rate $\lambda > 0$, i.e., during any
short time interval of length $h$ each member has probability $\lambda h + o(h)$ of creating
an offspring. Since there is no interaction among the individuals, if at epoch $T$ the
population size is $k$, the probability that an increase takes place at some time between $T$
and $T+h$ equals $k\lambda h + o(h)$. Formally, let $N(T)$ be the number of individuals at
time $T$ with $N(0) = 1$; then if $N(T) = k$, $k \geq 1$, the probability of a new birth in
$(T, T+h)$ is $k\lambda h + o(h)$, and the probability of more than one birth is $o(h)$, i.e.,

$$\mathbb{P}(N(T+h) - N(T) = \ell \mid N(T) = k) = \begin{cases} k\lambda h + o(h), & \ell = 1, \\ o(h), & \ell > 1, \\ 1 - k\lambda h + o(h), & \ell = 0. \end{cases}$$

Thus, $\{N(T)\}_{T \geq 0}$ is a pure birth process and, with the initial condition
$\mathbb{P}(N(0) = k) = \delta_{k,1}$, this linear birth process is called the Yule process.
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Consider now two independent Yule processes, {N 0 {T)}t>o and {Nx{T)}t>o, with pa¬ 
rameters /3 > 0 and A > 0 respectively, such that when a new individual appears in the 
process with parameter /3, a new Yule process with parameter A starts. In a random 
graph context, a Yule model can be characterized through Yule processes of different 
parameters as described in the following. The first Yule process denoted by {Np{T)}t>o, 
/3 > 0, accounts for the growth of the number of vertices. As soon as the hrst vertex is 
created, a second Yule process, {N\{T)}t>q, A > 0, starts describing the creation of in¬ 
links to the vertex. The evolution of the number of in-links for the successively created 
vertices, proceeds similarly. Specifically, for each of the subsequent created vertices, an 
independent copy of {Nx{T)}t>o, modeling the appearance of the in-links is initiated. 
Let us dehne Yq = 0 and for fc > 1, 

n = inf{T: Nx{T)^k+l}, 

so that Yk is the time of the fcth birth, and = Yc — Y;_i is the waiting time between 
the (fc — l)th and the fcth birth. In a Yule process it is well-known that the waiting 
times , fc > 1, are independent, each exponentially distributed with parameter Afc. 
Conversely, it is possible to reconstruct {Nx{T)}t>o from the knowledge of the Wj , 
J > 1, by defining 

k 

Yk = Y^ W;, Nx{T) = min{fc: Yk > T}. (3.18) 

t=i 

Thus if the W* are independently distributed exponential random variables, of parameter 
\j, then {Nx{T)}t>o is a Yule process of parameter A. 
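
The construction through exponential waiting times can be illustrated with a short simulation. The following is only a sketch under the description above: the function names `yule_population` and `mean_population` are hypothetical, and the check relies on the standard fact that a geometric law with parameter $e^{-\lambda T}$ has mean $e^{\lambda T}$.

```python
import math
import random


def yule_population(rate, horizon, rng):
    """Return N(horizon) for a Yule process started from one individual.

    While the population has `size` individuals, the waiting time to the
    next birth is exponential with parameter rate * size.
    """
    size, clock = 1, 0.0
    while True:
        clock += rng.expovariate(rate * size)
        if clock > horizon:
            return size
        size += 1


def mean_population(rate, horizon, runs, seed=0):
    """Monte Carlo estimate of E[N(horizon)] over independent runs."""
    rng = random.Random(seed)
    total = sum(yule_population(rate, horizon, rng) for _ in range(runs))
    return total / runs


if __name__ == "__main__":
    estimate = mean_population(rate=1.0, horizon=1.0, runs=20000)
    print("estimated mean:", estimate, "theoretical mean:", math.exp(1.0))
```

With rate 1 and horizon 1 the estimate should be close to $e$, up to Monte Carlo error.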

2. Historical context: Yule in [23] observed that the distribution of species per genus in 
the evolution of a biological population typically presents a power law behavior; thus, 
he proposed a stochastic model to fit these data. In the original paper [23] the process 
is described as follows: 

Let the chance of a species “throwing” a specific mutation, i.e., a new species 
of the same genus, in some small assigned interval of time be p, and suppose 
the interval so small that $p^2$ may be ignored compared with p. Then, putting 
aside generic mutations altogether for the present, if we start with N prime 
species of different genera, at the end of the interval we will have N(1 - p) 
which remain monotype and Np genera of two species. The new species as well 
as the old can now throw specific mutation. 

Yule proceeded to the limit, taking the time interval $\Delta T$ as indefinitely small but the number of 
such intervals n as large, so that $n\Delta T = T$ is finite, and he wrote $p = \lambda\Delta T = \lambda T/n$. Yule 
did not study only this process. In [24], he also constructed a model of evolution 
by considering two independent Yule processes, one for species with a constant rate 
$\lambda > 0$ and the other for new genera (each of them composed of a single species) created 
at a constant rate $\beta > 0$. In other words, at time T = 0 the process starts with a 
single genus composed of a single species. As time goes on, new genera (each composed 
of a single species) develop as a Yule process of parameter $\beta$, and simultaneously and 
independently new species evolve as a Yule process with rate $\lambda$. Furthermore, since a 
new genus appears with a single species, each time a genus is born, a Yule process 
with rate $\lambda$ starts. 

3. Available results: Let $N_g(T)$ and $N_s(T)$, $T \ge 0$, be the counting processes measuring 
the number of genera and species created until time T, respectively. It is well-known 
that the probability distribution of the number of individuals in a Yule process with 
parameter $\lambda$ is geometric, $\mathrm{Geo}(e^{-\lambda T})$. Thus, the distribution of the number of species 
$N_s(T)$ in a genus during the interval of time [0, T] is 

\[ P(N_s(T) = k) = e^{-\lambda T}\bigl(1 - e^{-\lambda T}\bigr)^{k-1}, \qquad k \ge 1,\ T \ge 0. \tag{3.19} \]

On the other hand, it is also known that by conditioning on the number of genera present 
at time T, the random instants at which creation of novel genera occurs are distributed 
as the order statistics of iid random variables with distribution function 

\[ P(\mathcal{T} \le \tau) = \frac{e^{\beta\tau} - 1}{e^{\beta T} - 1}, \qquad 0 \le \tau \le T \tag{3.20} \]

(see [T3] and the references therein). The authors in [T3] take into account that the 
homogeneous linear pure birth process lies in the class of the so-called processes with 
the order statistic property, see m, and use [HI nil HU and [221 to get (13.2011 . 

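Thus, let $\mathcal{N}_T$ be the size of a genus chosen uniformly at random at time T. Then, 

\[
\begin{aligned}
P(\mathcal{N}_T = k) &= \int_0^T P\bigl(N_s(T) = k \mid N_s(\tau) = 1\bigr)\, P(\mathcal{T} \in d\tau) \\
&= \int_0^T e^{-\lambda(T-\tau)}\bigl(1 - e^{-\lambda(T-\tau)}\bigr)^{k-1}\, \frac{\beta e^{\beta\tau}}{e^{\beta T} - 1}\, d\tau \\
&= \frac{\beta}{1 - e^{-\beta T}} \int_0^T e^{-\beta y} e^{-\lambda y}\bigl(1 - e^{-\lambda y}\bigr)^{k-1}\, dy.
\end{aligned}
\tag{3.21}
\]

The interest now is in the limit behavior when $T \to \infty$: 

\[ \lim_{T \to \infty} P(\mathcal{N}_T = k) = \beta \int_0^\infty e^{-\beta y} e^{-\lambda y}\bigl(1 - e^{-\lambda y}\bigr)^{k-1}\, dy. \tag{3.22} \]

Letting $\rho = \beta/\lambda$ it is possible to recognize the integral as a beta integral to obtain (see 
[21], page 39) 

\[ \lim_{T \to \infty} P(\mathcal{N}_T = k) = \rho\, \frac{\Gamma(k)\Gamma(1 + \rho)}{\Gamma(k + 1 + \rho)} = \rho\, B(k, 1 + \rho), \qquad k \ge 1. \tag{3.23} \]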
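
As a numerical sanity check of the limit distribution $\rho B(k, 1+\rho)$, one can evaluate it through log-gamma functions and verify that the probabilities sum to one. This is a minimal sketch; the function name `yule_limit_pmf` is hypothetical and not part of the original text.

```python
import math


def yule_limit_pmf(k, rho):
    """Evaluate rho * B(k, 1 + rho) using log-gamma for numerical stability."""
    log_beta = math.lgamma(k) + math.lgamma(1.0 + rho) - math.lgamma(k + 1.0 + rho)
    return rho * math.exp(log_beta)


if __name__ == "__main__":
    rho = 2.0
    partial_sum = sum(yule_limit_pmf(k, rho) for k in range(1, 2001))
    print("sum over k = 1..2000:", partial_sum)
    # For rho = 1 the formula reduces to 1 / (k (k + 1)).
    print("rho = 1, k = 1:", yule_limit_pmf(1, 1.0))
```

The tail of the distribution decays like $k^{-(1+\rho)}$, which is the power-law behavior mentioned in the historical context.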


4 Main Results 


4.1 Relations between the four discrete-time models 


In [H, Bornholdt and Ebel pointed out that the asymptotic power law of the Barabasi-Albert 
model with m = 1 coincides with that of the Simon model characterized by $\alpha = 1/2$ (see 
(3.10) and (3.17)). From this observation they suggested that for m = 1 the Barabasi-Albert 
model could be mapped to the subclass of Simon models with $\alpha = 1/2$. 

We think that (3.10) and (3.17) cannot be compared directly, since they give the asymptotics of 
different random variables: the first refers to the in-degree distribution and the second to the 
degree distribution. However, we do believe that there may exist a relation between these two 
models, which needs to be explained. 

In this section we give rigorous arguments that clarify the relation between the 
Simon and Barabasi-Albert models. To this aim we also make use of the II-PA model. We 
first relate the Barabasi-Albert and II-PA models, and then the II-PA and Simon models. This 
two-step approach is made necessary by the different quantities described by these models. 

Now we are ready to formulate our results. 

Theorem 4.1. Let m = 1. Then, the in-degree distribution of the II-PA model and the degree 
distribution of the Barabasi-Albert model at time t, $t \ge 2$, are the same, i.e., if at time t there 
are n vertices in the processes, then for any $k \in \mathbb{Z}^+$, 

\[ \frac{N^{\text{II-PA}}_{k,t}}{n} = \frac{N^{\text{BA}}_{k,t}}{n}, \]

where $N^{\text{II-PA}}_{k,t}$ and $N^{\text{BA}}_{k,t}$ denote the number of vertices with in-degree and degree equal to k 
at time t in the II-PA and Barabasi-Albert models, respectively. 


Proof. We follow the mathematical description of the II-PA and Barabasi-Albert models in 
terms of random graph processes, presented in Sections 3.2 and 3.4, respectively. 
Let us divide each unit of time into two sub-units. At each instant of time 
t = 2n + 1 a new vertex $v_{n+1}$ is created in both models; in the II-PA model this vertex is 
created together with a directed loop. Furthermore, at each time t = 2n + 2 = 2(n + 1) an edge 
(a directed edge in the II-PA model) is added from $v_{n+1}$ to $v_j$, $j \le n + 1$, with probabilities 
given by (3.11) and (3.16) for the II-PA and Barabasi-Albert models, respectively. Hence our 
thesis amounts to showing that (3.11) and (3.16) coincide under our hypotheses. 
We see that the denominator of both probabilities (3.11) and (3.16) is 2n + 1, and although 
the two numerators count different quantities, the in-degree for the II-PA and the degree for 
the Barabasi-Albert model, their values also coincide. This is easy to check when $v_j = v_{n+1}$, i.e., when 
the directed edge created at time t = 2(n + 1) points to $v_{n+1}$. In fact, the numerators of (3.11) 
and (3.16) are then both equal to one. Let us now show that the two numerators coincide also when 
$v_j \ne v_{n+1}$. 

Let us suppose $v_j \ne v_{n+1}$, and let t = 2(n + 1) - 1 = 2n + 1. Observe that in the 
Barabasi-Albert model the degree of $v_j$ at time t = 2n + 1, $d(v_j, 2n+1)$, $j \in \{1, 2, \dots, n\}$, is the sum of 
the number of incoming edges from time t = 2j + 1 (when $v_{j+1}$ is added) to time 2n + 1, plus 
the degree corresponding to the edge added at time t = 2j from $v_j$, that is, two if the edge was 
a loop and one otherwise. On the other hand, in the II-PA model the in-degree of $v_j$ at time 
t = 2n + 1, $d(v_j, 2n+1)$, $j \in \{1, 2, \dots, n\}$, is the sum of the number of incoming edges added 
in the interval of time $t \in [2j + 1, 2n + 1]$ (so this part coincides with the Barabasi-Albert model), 
plus the in-degree corresponding to the directed edge added at time t = 2j from $v_j$. Thus, if 
it is a directed loop to $v_j$, the in-degree of $v_j$ at time t = 2j is two (since $v_j$ appeared 
together with a directed loop); otherwise the in-degree is one. Hence 
(3.11) and (3.16) coincide, which concludes the proof. □ 

Remark 4.1. The proof of Theorem 4.1 highlights the advantage of redefining 
existing models in terms of random graph processes. In particular, this reading shows 
immediately that the two models can be related only when m = 1. 

Remark 4.2. The in-degree distribution of the II-PA model and the degree distribution of the 
Barabasi-Albert model are different when m > 1. In fact, take for example m = 2 and suppose 
the first directed edge from $v_{n+1}$ is not a loop, i.e., a vertex $v_j$, $1 \le j \le n$, is chosen. Then, 
at time t = 3n + 1, $d(v_{n+1}, 3n+1) = 1$ in the II-PA model, while $d(v_{n+1}, 3n+1) = 2$ in the 
Barabasi-Albert model. Thus at time t = 3n + 2, (3.11) and (3.16) are different because the 
corresponding numerators differ. 

Next we discuss the relationship between the Simon and II-PA models, which allows us to 
relate the Barabasi-Albert and Simon models. Before stating such a result, observe the following 
fact. Let $Y_i$ be a random variable that counts the number of directed edges originated in the 
Simon model by the ith vertex $v_i$, until the appearance of the (i + 1)th vertex. Note that $Y_i$ 
follows a geometric distribution with parameter $\alpha$. So, if $\alpha = 1/(m+1)$, then $E Y_i = m$, 
which is the number of outgoing links from a vertex in the II-PA and Barabasi-Albert 
models. 

What we will establish in the following theorem is that the asymptotic in-degree 
distributions of the II-PA and Simon models coincide when $\alpha = 1/(m+1)$. To do that, we first 
introduce the following definition. 

Definition 4.1. We say that a vertex $v_i$ appears “complete” when it has appeared in the 
process together with all the directed edges originated from it. Thus, at time t = n(m + 1), the 
II-PA model has exactly n “complete” vertices. 

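Now we are ready to state the theorem. 

Theorem 4.2. Let $m \in \mathbb{Z}^+$ be fixed. If $(G^t_m)_{t \ge 1}$ is the random graph process defining the 
II-PA model, and $N^{\text{II-PA}}_{k,t}$ is the number of vertices with in-degree equal to k, $k \ge 1$, at time t in 
$(G^t_m)_{t \ge 1}$, then, for t = n(m + 1), 

\[ \frac{N^{\text{II-PA}}_{k,t}}{n} \longrightarrow \frac{(1 + 1/m)\,\Gamma(k)\,\Gamma(2 + 1/m)}{\Gamma(k + 2 + 1/m)} \tag{4.1} \]

almost surely as $n \to \infty$. 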
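
The limit in Theorem 4.2 can be explored empirically. The sketch below assumes the II-PA dynamics as described in this section: each new vertex arrives with a directed loop and then sends m directed edges, one at a time, to vertices (possibly itself) chosen with probability proportional to their current in-degree. The function names are hypothetical, and the simulation is an illustration, not the authors' procedure.

```python
import math
import random
from collections import Counter


def simulate_ii_pa(n, m, seed=0):
    """Return the in-degrees of n complete vertices of an II-PA style process."""
    rng = random.Random(seed)
    indegree = []
    endpoints = []  # vertex v is listed once per unit of in-degree
    for v in range(n):
        indegree.append(1)  # the new vertex arrives with a directed loop
        endpoints.append(v)
        for _ in range(m):
            # a uniform draw from `endpoints` is proportional to in-degree
            target = rng.choice(endpoints)
            indegree[target] += 1
            endpoints.append(target)
    return indegree


def limit_fraction(k, m):
    """(1 + 1/m) Gamma(k) Gamma(2 + 1/m) / Gamma(k + 2 + 1/m)."""
    a = 1.0 / m
    log_ratio = math.lgamma(k) + math.lgamma(2 + a) - math.lgamma(k + 2 + a)
    return (1 + a) * math.exp(log_ratio)


def empirical_fraction(indegree, k):
    """Fraction of vertices whose in-degree equals k."""
    return Counter(indegree)[k] / len(indegree)


if __name__ == "__main__":
    degrees = simulate_ii_pa(n=20000, m=1)
    for k in (1, 2, 3):
        print(k, empirical_fraction(degrees, k), limit_fraction(k, 1))
```

For moderate n the empirical fractions should already lie close to the limit values, with fluctuations of order $n^{-1/2}$.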

Remark 4.3. Observe that by (4.1) and (3.10), if $\alpha = 1/(m+1)$ in the Simon model, then 
the asymptotic in-degree distributions of the Simon and II-PA models coincide. Moreover, from 
Theorem 4.1 and (3.10), it follows that the asymptotic degree distribution of the 
Barabasi-Albert model coincides with the asymptotic in-degree distribution of the Simon model only when 
m = 1 and $\alpha = 1/(m+1)$, so that $\alpha = 1/2$. We conjecture that some other properties of the 
Barabasi-Albert model when m = 1, for example the diameter, should also be related to the 
analogous features of the Simon model when $\alpha = 1/2$. 



Remark 4.4. Observe that coincides with urm . Thus, the previous theorem gives a 
rigorous formalization to the heuristic result in El- 

Remark 4.5. Theorem 4.2 can be compared with the recent model-free approach of 
Ostroumova, Ryabchenko and Samosvat (see Section 3, Theorem 2 in m, with A = m/(m + 1) 
and B = 0). However, in that work a preferential attachment rule proportional to the degree is 
considered, and Theorem 2 in m makes use of the initial condition that the degree of an 
existing vertex should be at least equal to m. Instead, the II-PA model uses a 
preferential attachment rule proportional to the in-degree, with the initial condition that the 
in-degree of an existing vertex should be at least equal to one. Therefore, the II-PA model 
does not fit into the general setup of Theorem 2 in m and we cannot directly apply it to get 
the result of Theorem 4.2 given in this paper. We believe however that, following these new 
ideas, but considering the in-degree and the corresponding initial condition, we can obtain the 
asymptotic in-degree distribution for the II-PA model. 

However, in this paper we use the master equations approach for consistency with the 
theory used to study the Simon model. 

Before proving Theorem 4.2 we need to prove the following lemmas. 

Lemma 4.1. Let $r, s, t \in \mathbb{Z}^+$ and $b \in \mathbb{R}$ be such that $|b/r| < 1$; then 

r=s + l 

Proof. Since \b/r\ < 1, then using Taylor expansion for ln(l — b/r) we get 

ri 

r=s + l r=s + l '' 

Now, by Euler-Maclaurin it is possible to obtain that (see m) 

2- EU^ = k- 2 j;^dy. 

Using these expressions and the fact that $y - 1 < \lfloor y \rfloor \le y$, where $\lfloor y \rfloor$ indicates the integer 
part of y, we obtain 

t 

1 1 t — s 1 , , 

Int —Ins-< > -<lnt —Ins, 

st ^' r 

r=l 

t — S A — S^ 1 t — s 

st {stA y2 ^ gf ’ 

r=l 

or, X]r=i !/»■ = Int-lns- |(5i|, where |5i| < {t-s)/{st), and 
where \ 62 \ < (A — s^)/(st)^. Thus, 

ri 

r=s + l 

□ 




(4.2) 

(4.3) 
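
The way Lemma 4.1 is used later in the section is to replace products of the form $\prod_{r=s+1}^{t}(1 - b/r)$ by a power of $s/t$. The sketch below is only a numerical illustration of that leading-order behavior, $(s/t)^b$, which follows from standard Gamma-function asymptotics; it does not reproduce the lemma's exact error term, and the function names are hypothetical.

```python
def product_exact(s, t, b):
    """Compute prod_{r = s+1}^{t} (1 - b / r) by direct multiplication."""
    value = 1.0
    for r in range(s + 1, t + 1):
        value *= 1.0 - b / r
    return value


def leading_order(s, t, b):
    """Leading-order approximation (s / t) ** b of the same product."""
    return (s / t) ** b


if __name__ == "__main__":
    s, t, b = 1000, 5000, 1.5
    exact = product_exact(s, t, b)
    approx = leading_order(s, t, b)
    print("exact:", exact, "approximation:", approx)
    print("relative error:", abs(exact - approx) / exact)
```

The relative error shrinks as s grows, which is the regime in which the approximation is applied in the proofs.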



Lemma 4.2. Let $N^{\text{II-PA}}_{k,t}$ and N(k, n) denote the number of vertices with in-degree equal to k, 
$k \ge 1$, at time t, and the number of vertices with in-degree k when there are exactly n complete 
vertices in the II-PA model, respectively. Then, 

• for m = 1 and k = 1, 

\[ EN(1, n+1) = \Bigl(1 - \frac{1}{(m+1)(n+1) - 1}\Bigr) + \Bigl(1 - \frac{1}{(m+1)(n+1) - 1}\Bigr) EN(1, n); \tag{4.4} \]

• for m > 1 and k = 1, 

\[ EN(1, n+1) = 1 + \Bigl(1 - \frac{m}{(n+1)(m+1) - 1}\Bigr) EN(1, n) + O\Bigl(\frac{1}{n}\Bigr); \tag{4.5} \]

• for m = 1 and $k \ge 2$, 

\[
\begin{aligned}
EN(k, n+1) &= \frac{(k-1)\,EN(k-1, n)}{(n+1)(m+1) - 1} + \Bigl(1 - \frac{k}{(n+1)(m+1) - 1}\Bigr) EN(k, n) \\
&= \frac{(k-1)\,EN(k-1, n)}{2(n+1) - 1} + \Bigl(1 - \frac{k}{2(n+1) - 1}\Bigr) EN(k, n);
\end{aligned}
\tag{4.6}
\]

• for m > 1 and $k \ge 2$, 

\[ EN(k, n+1) = \frac{(k-1)m\,EN(k-1, n)}{(n+1)(m+1) - 1} + \Bigl(1 - \frac{km}{(n+1)(m+1) - 1}\Bigr) EN(k, n) + O\Bigl(\frac{k}{n}\Bigr). \tag{4.7} \]

Proof. Let m = 1 and k = 1. We start at time t = (m + 1)n = 2n, i.e., when there are exactly 
n complete vertices. To see what happens when exactly (n + 1) complete vertices appear, we 
need to check what happens in two steps of the process: at time 2n + 1, when a new vertex 
with a directed loop appears deterministically, and at time 2n + 2 = 2(n + 1), when a new directed 
edge is added by preferential attachment and the last vertex added becomes complete. Thus, 
conditioning on what happens until time t + 1, we have 


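\[
\begin{aligned}
E\bigl(N^{\text{II-PA}}_{1,t+2}\bigr) &= E\Bigl[ N^{\text{II-PA}}_{1,t}\, \frac{N^{\text{II-PA}}_{1,t} + 1}{t+1} + \bigl(N^{\text{II-PA}}_{1,t} + 1\bigr)\Bigl(1 - \frac{N^{\text{II-PA}}_{1,t} + 1}{t+1}\Bigr) \Bigr] \\
&= 1 - \frac{1}{t+1} + \Bigl(1 - \frac{1}{t+1}\Bigr) E\bigl(N^{\text{II-PA}}_{1,t}\bigr).
\end{aligned}
\tag{4.8}
\]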


Thus, if N(k, n) denotes the number of vertices with in-degree k when there are exactly n 
complete vertices in the process, then we can write the previous equation as (4.4). 

Let m > 1 and k = 1. Now more care is needed, since we have to consider two 
different situations: when t is a multiple of (m + 1) and when it is not. In the first situation 
t has the form t = n(m + 1), so we are at the instant of time when there are exactly n 
complete vertices, and as we did above, to see what happens later we check 
the two subsequent steps of the process: at time n(m + 1) + 1, when a deterministic event 
happens (a new vertex with a directed loop appears), and at time n(m + 1) + 2, when something 
probabilistic happens (a new directed edge is added by preferential attachment). In this first 
situation equation (4.8) still holds. In the second situation observe that if m > 1, in order to see 
the vertex added at time t = n(m + 1) + 1 become complete, we have to check what happens from 
n(m + 1) + 1 until n(m + 1) + (m + 1) = (n + 1)(m + 1), when this vertex becomes complete. 
Thus, when t is not a multiple of (m + 1) we have the following equation. 


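\[
E\bigl(N^{\text{II-PA}}_{1,t+1}\bigr) = E\Bigl[ \bigl(N^{\text{II-PA}}_{1,t} - 1\bigr) \frac{N^{\text{II-PA}}_{1,t}}{t} + N^{\text{II-PA}}_{1,t}\Bigl(1 - \frac{N^{\text{II-PA}}_{1,t}}{t}\Bigr) \Bigr] = \Bigl(1 - \frac{1}{t}\Bigr) E\bigl(N^{\text{II-PA}}_{1,t}\bigr).
\tag{4.9}
\]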


Now, we may use (4.8) and (4.9) together to get the corresponding equation for what 
happens in (m + 1) steps of the process. We start at time t = (n + 1)(m + 1) - 1, so at time 
t + 1 the process will have exactly (n + 1) complete vertices, and since t is not a multiple of 
(m + 1), we need to begin using (4.9) (m - 1) times, and then use (4.8). Iterating m times, 
we obtain 




m — l 

1 + i5(iv;;(!t)j n ^ 


j=Q 


1 + i5(fVi"(Pt) 


3' 

n (-0 


r=£ —(m—1) 




l + E{NlY_t) 


(4.10) 
where we have used in the last two steps that r = t - j and Lemma 4.1. Finally, using 
the notation N(1, n), and since $N(1, n)/n \le 1$, we get (4.5). 

The cases k = 2 and k > 2 need to be first considered separately, and in each of these 
we need to analyze when m = 1 and when m > 1. Then we will show that the equations for 
k = 2 and k > 2 admit a general form, which covers all the cases $k \ge 2$. 

Let m = 1. As we did when m = 1 and k = 1, consider the time t = n(m + 1), 
i.e., when there are exactly n complete vertices. To account for what happens until 
(n + 1) complete vertices appear, we need to consider two steps of the process. Indeed, 


= E {Niy^ + 1) 


+ Niy^{i- 

rII-PA\ 


nIY^ +1 


t + 1 
iVn-PA 1 2NY 


+ {Niy^ - 1)- 




t+i 


E{NYt ) + 1 
t -\- 1 


t 1 
2 




and for k > 2, 


II-PAx ,1, /Vfll-PA , , / Vfll-PA 


t 1 

TlI-PA 


+ - 1 )- 


kNi 


i + 1 


, ^II-PA A _ {k-l)NYYt + kNY^^ ^ 

V t + 1 ) 

(fc - 


t+1 






(4.11) 


(4.12) 


Note that in the last line of (SHI) and 632)1, we have replaced with NIY^- In fnct 

if t = n{m + 1), then at time t+1 the process just adds deterministically a new vertex with 
in-degree one, so when fc > 2, NIYA ~ ^lY^’ = ^lY^ I- Using this 

observation we can express (4.11) and (4.12) as a single equation holding for $k \ge 2$ and m = 1. 
Using the notation N(k, n) it can be written as (4.6). 

Let m > 1. Once more we need to consider when t is a multiple of (m + 1), and when it is 
not. When t = n(m + 1) we obtain again (4.11) and (4.12) for k = 2 and k > 2, respectively, 
while if t is not a multiple of (m + 1) and $k \ge 2$ it holds 


E(iV"^^) = E + 




(fc - i)nIYY 


hN^ 

fll-PA 


+ - 1 )- 


+ NlY^{ 


1 - 


(fc - i)NYYt + kNY 


k,t 
II-PA 


(fc - mjNiYY ^(^_k 


(l - -)e(V, 


II-PA\ 
k,t ) 


(4.13) 


Now, in order to get the corresponding equation for what happens in (m + 1) steps, i.e., during 
the time interval from when there are n vertices until when there are (n + 1) vertices, it is 
necessary to use (4.12) and (4.13) together. In the same manner as we did for k = 1, 
we take t = (n + 1)(m + 1) - 1, so that at time t + 1 the process will have exactly (n + 1) 
complete vertices. Since t is not a multiple of (m + 1), we need to begin using (4.13) (m - 1) 
times, and then use (4.12). Thus, iterating m times, after some algebra we obtain that for 
any $k \ge 2$, 


m— I 

E(iv^:s^) = 


^ Uk-iMNlYYAYi(i- 


i=0 
m—l 

-[n(>- 

j=o 


t — i 
fc 

t- j 


j^O 


t-j 


E { NlYYm - l )), 


(4.14) 
where the empty product (i.e., when i = 0) is equal to unity. Let now r = t - j, and since 
$i \le m$ and m is fixed, then by Lemma 4.1 we have 


no- 




n (>-^) 


r=£ —(i —1) 



(4.15) 


Moreover, observe that < rn + 1 — i, since at each instant at 

most one edge is added. Thus 


m— 1 

E 


i=0 




t — i 






,t —(m —1) 


d) 


+ 0 


{\y 



(4.16) 


Then using (14.151) and (14.161) and noting that 




< 1, we can write (14.141) as 




(fc-l)mE(A?i(:T:^_(_,)) 


0-^) 




(l). 


(4.17) 


and using the notation N(k, n) we obtain (4.7). □ 

Theorem 4.2 gives the limit value to which N(k, n)/n converges when n goes to infinity. 
However, before proving the limit, we need to argue that such a limit exists. 

Lemma 4.3. Let N(k, n) be as in Lemma 4.2. Then, there exist values $N_1(k) > 0$ and 
$N_2(k) > 0$ such that 

\[ \frac{N(k, n)}{n} \longrightarrow N_1(k) \quad \text{a.s.,} \]

and 

\[ \frac{mk\,N(k, n)}{(m+1)\,n} \longrightarrow N_2(k) \quad \text{a.s.} \]


Proof. We make use of the supermartingale convergence theorem (see [5], Theorem 35.5) and 
equations (4.4), (4.5), (4.6) and (4.7). Consider first (4.4) and (4.5) and observe that since 
$N(1, n)/((n+1)(m+1) - 1) \le 1$, 

\[ EN(1, n+1) \le EN(1, n) + 1, \tag{4.18} \]

while for (4.6) and (4.7), 

\[ EN(k, n+1) \le EN(k, n) + 1 + O\Bigl(\frac{k}{n}\Bigr). \tag{4.19} \]


Let Hn be the filtration generated by the process {N{k,n), N{k — l,n)}„ until time n, i.e.. 
Tin ■= a{N{k, j), N{k — 1, j); 0 < j < n). If k = 1, let Z{l,n) = (fV(l, n) — n)fn, then by 

( 033 ), 


E[Z(l,n+ 1) I Ur,] < 


Ai(l,n) + 1 — (n + 1) ^ Ai(l,n) + 1 — (n + 1) 


n + 1 


= Z(l,n), (4.20) 


as N{l,n) is Hn-measurable. Hence, {Z(l,n)}„ is a supermartingale and in order to apply 
supermartingale convergence theorem to {Z(l,n)}„, it remains to prove that 


supE(|Z(l,n)|) < oo. 


This is true as 

\[ E(|Z(1, n)|) = E\bigl[E(|Z(1, n)| \mid \mathcal{H}_{n-1})\bigr] \le \frac{1}{n}\bigl(EN(1, n-1) + 1 + n\bigr) < \infty, \tag{4.21} \]

having used that $N(k, n)/n \le 1$, for any $n \ge 1$. 

When $k \ge 2$, i.e., for (4.6) and (4.7), note first that if f(n) = O(k/n), then there exists 
M > 0 such that $|f(n)| = Mk/n + |\delta|$, where $|\delta| < k/n$. Since $k/n \le 1$, then there exists M 
such that $|f(n)| \le M + 1$. Thus, take Z(k, n) = [N(k, n) - n(c + 1)]/n, with c = M + 1; then 
by (4.19) we also get that $\{Z(k, n)\}_n$ is a supermartingale, and as we did above we 
can also show that $\sup_n E(|Z(k, n)|) < \infty$. In this manner we have proved that Z(k, n) converges 
almost surely, thus N(k, n)/n converges almost surely. In perfect analogy we can prove that 
mkZ(k, n)/(m + 1) converges almost surely, and thus obtain that mkN(k, n)/(n(m + 1)) also 
converges almost surely. □ 


In order to determine such a limit, we still need to prove the following lemma. 

Lemma 4.4. Let $p_k = \lim_{n \to \infty} EN(k, n)/n$. Then, 

\[ p_k = \frac{(1 + 1/m)\,\Gamma(k)\,\Gamma(2 + 1/m)}{\Gamma(k + 2 + 1/m)}, \qquad k \ge 1. \tag{4.22} \]

Proof. Observe that for a function f(k), 

\[ \frac{m f(k)}{(n+1)(m+1) - 1} = \frac{m f(k)}{n(m+1)} \Bigl(1 - \frac{m}{(n+1)(m+1) - 1}\Bigr) = \frac{m f(k)}{n(m+1)} + O\Bigl(\frac{f(k)}{n^2}\Bigr). \tag{4.23} \]


By using (4.23) we can write the equations (4.4), (4.5), (4.6) and (4.7) as follows. For m = 1, 
k = 1, 

\[ EN(1, n+1) = 1 + \Bigl(1 - \frac{1}{n(m+1)}\Bigr) EN(1, n) + O\Bigl(\frac{1}{n}\Bigr), \tag{4.24} \]

for m > 1, k = 1, 

\[ EN(1, n+1) = 1 + \Bigl(1 - \frac{m}{n(m+1)}\Bigr) EN(1, n) + O\Bigl(\frac{1}{n}\Bigr), \tag{4.25} \]

for m = 1, $k \ge 2$, 

\[ EN(k, n+1) = \frac{(k-1)\,EN(k-1, n)}{n(m+1)} + \Bigl(1 - \frac{k}{n(m+1)}\Bigr) EN(k, n) + O\Bigl(\frac{1}{n}\Bigr), \tag{4.26} \]

and for m > 1, $k \ge 2$, 

\[ EN(k, n+1) = \frac{m(k-1)\,EN(k-1, n)}{n(m+1)} + \Bigl(1 - \frac{mk}{n(m+1)}\Bigr) EN(k, n) + O\Bigl(\frac{k}{n}\Bigr). \tag{4.27} \]

Looking at (4.24), (4.25), (4.26) and (4.27), we remark that they can be unified as 

\[ EN(k, n+1) = g(k-1, n) + \Bigl(1 - \frac{b}{n}\Bigr) EN(k, n) + \delta_n, \tag{4.28} \]

where b = km/(m + 1), g(0, n) = 1, g(k - 1, n) = (m(k - 1)/(n(m + 1))) EN(k - 1, n) for $k \ge 2$, and 
$\delta_n = O(1/n)$ if m = 1 and of order O(k/n) if m > 1. We underline that k could be a function 
of n and hence in general O(k/n) can be different from O(1/n). 

Note now that when the first complete vertex appears, it has in-degree equal to (m + 1), so 
N(k, 1) = 0 for any $k \ne (m + 1)$, and N(m + 1, 1) = 1. Iterating (4.28) we have, if $k \ne (m + 1)$, 

\[ EN(k, n+1) = \sum_{i=0}^{n-1} g(k-1, n-i) \prod_{j=0}^{i-1} \Bigl(1 - \frac{b}{n-j}\Bigr) + \sum_{i=0}^{n-1} \delta_{n-i} \prod_{j=0}^{i-1} \Bigl(1 - \frac{b}{n-j}\Bigr), \tag{4.29} \]

while, if k = m + 1, 

\[ EN(k, n+1) = \sum_{i=0}^{n-1} g(k-1, n-i) \prod_{j=0}^{i-1} \Bigl(1 - \frac{b}{n-j}\Bigr) + \prod_{j=0}^{n-1} \Bigl(1 - \frac{b}{n-j}\Bigr) + \sum_{i=0}^{n-1} \delta_{n-i} \prod_{j=0}^{i-1} \Bigl(1 - \frac{b}{n-j}\Bigr). \tag{4.30} \]

To solve (4.30), let s = n - i and r = n - j so that 

\[ \prod_{j=0}^{i-1} \Bigl(1 - \frac{b}{n-j}\Bigr) = \prod_{r=s+1}^{n} \Bigl(1 - \frac{b}{r}\Bigr), \]


then observe that if s < w, 


n 


1- -1 = 
r 


r=s+l r=s+l 

which is equal either to 0 if b = [6J or to 

L6J 


L^J 1 ^ 

n (>-;) n 


r=LbJ+l 


(-i)wn 


ih-i) 


(LbJ-i + 1) 


n 


r=[6J+l 


if b 7 ^ [bj. Applying Lemma [4. 1 1 fnote that to apply this lemma is necessary to have b/r < 1, 
i.e., r > [bJ + 1) we have 

„ fo, s < [bJ, b = [bJ, 

n = s<LbJ,b/L6J. 

Using this and (lO) . formula (14.3011 can be written as 

mNlk.n + 1) = ± oik- i..,(l)‘(i + o(2^)) +o(lh)'‘T.f 


s=[6J 


n ^ 


(4.31) 


s=Li)j 


where the error term $\delta$ is of order O(ln n) if m = 1 and of order O(k ln n) if m > 1. It is not 
difficult to see that, following a similar procedure, we can get the same solution for (4.29). 

Now, by Lemma 4.3 we know that there exist some $N_1(k) > 0$ and some $N_2(k) > 0$ such 
that $N(k, n)/n \to N_1(k)$ and $mkN(k, n)/[n(m + 1)] \to N_2(k)$ almost surely (observe that 
in order to guarantee a.s. convergence, we need to take k independent of n, hence we 
obtain $N_1(k)$ and $N_2(k)$ strictly positive). Thus, by the dominated convergence theorem, 
$p_k := \lim_{n \to \infty} EN(k, n)/n$ and, for $k \ge 2$, $g(k-1) := \lim_{n \to \infty} g(k-1, n)$ exist. 

Note that $g(k-1) = m(k-1)p_{k-1}/(m+1)$, and let us write $g(k-1, n) = g(k-1) + O(\varepsilon_n)$, 
where $\varepsilon_n \to 0$ as $n \to \infty$. Hence, 


n , 


-1) 
n*' 


n 





E 0{es)s\ 

s=L6J 


(4.32) 


and using that = n*’+^/(b + 1) + o(n^''''^) (see 3.II of [H]), we obtain 

EJV(fc,n + l) m=l, 

n + l , m>l. 


(4.33) 


Observe that when m > 1 we would need more restrictions on k in order to determine the 
limit of EN(k, n + 1)/(n + 1). Indeed it should satisfy that $k \ln n / n \to 0$ as $n \to \infty$, but 
that is true since we are taking k fixed, i.e., independent of n. Thus, by (4.33), 

\[
p_k = \lim_{n \to \infty} \frac{EN(k, n+1)}{n+1} =
\begin{cases}
\dfrac{m+1}{2m+1}, & k = 1, \\[2ex]
\dfrac{m(k-1)\,p_{k-1}}{m(k+1) + 1}, & k > 1.
\end{cases}
\tag{4.34}
\]

Solving (4.34) recursively we get (4.22). 

□ 
Proof of Theorem 4.2. We follow the approach of Dorogovtsev, Mendes, and Samukhin [5], 
which uses master equations for the expected value of the number of vertices with in-degree k. 
To obtain the exact equations we need to consider two stages. For the first one we consider 
what happens in one step of the process, during which the number of vertices of in-degree 
k can be increased by counting also some vertices coming from those having previously 
in-degree (k - 1) or in-degree (k + 1), and then we consider what happens in (m + 1) steps, thus 
obtaining the change of the vertices' in-degree in an interval of time starting when the process 
has n vertices, until it has (n + 1) vertices. This part corresponds to finding the equations 
(4.4), (4.5), (4.6) and (4.7) given in Lemma 4.2. For the second stage, we iterate the previous 
equations with respect to n and obtain the limit of EN(k, n)/n as $n \to \infty$. This part was 
proved in Lemma 4.4, determining (4.22). 
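
The step from the recursion (4.34) to the closed form (4.22) can be checked numerically. The sketch below iterates the recursion $p_1 = (m+1)/(2m+1)$, $p_k = m(k-1)p_{k-1}/(m(k+1)+1)$ and compares it with the Gamma-function expression; the function names are hypothetical and the code is only an illustration.

```python
import math


def p_recursive(k_max, m):
    """Return [p_1, ..., p_{k_max}] obtained by iterating the recursion (4.34)."""
    values = [(m + 1) / (2 * m + 1)]
    for k in range(2, k_max + 1):
        values.append(m * (k - 1) * values[-1] / (m * (k + 1) + 1))
    return values


def p_closed(k, m):
    """Closed form (1 + 1/m) Gamma(k) Gamma(2 + 1/m) / Gamma(k + 2 + 1/m)."""
    a = 1.0 / m
    log_ratio = math.lgamma(k) + math.lgamma(2 + a) - math.lgamma(k + 2 + a)
    return (1 + a) * math.exp(log_ratio)


if __name__ == "__main__":
    m = 3
    for k, value in enumerate(p_recursive(10, m), start=1):
        print(k, value, p_closed(k, m))
```

Both columns should agree up to floating-point rounding for every k.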

Finally, we use Azuma-Hoeffding inequality (12.111 to obtain dm. Let be the natural 
filtration generated by the process up to time t. Then, in the same way as it 

was explained for to Simon model. Section 1(1.11 it is easy to show that for s < t, = 

is a martingale such that, < 1, and = 

(at time t = 0 the random graph is the empty graph). Thus by (12.11) we get that for 
every O l/v^, e.g. take e„ = yj\nn/n, 



as n goes to infinity. Here t = n(m + 1) + i, for i = 0, 1, ..., m. Thus we obtain that for 
t = n(m + 1), 

\[ \frac{N^{\text{II-PA}}_{k,t}}{n} \longrightarrow \frac{(1 + 1/m)\,\Gamma(k)\,\Gamma(2 + 1/m)}{\Gamma(k + 2 + 1/m)} \]

in probability as $n \to \infty$. However, by Lemma 4.3 we actually have almost sure 
convergence. □ 


4.1.1 The Price model 


Let $M_1, M_2, \dots$ be independent and identically distributed random variables with $E(M_i) = m$, 
where m is a positive rational number, and $\mathrm{Var}(M_i) = \sigma^2$. Furthermore, let $(G^n_m)_{n \ge 1}$ be the 
random graph process defining the Price model as in Section 3.3 and take $k_0 = 1$. If $N^{\text{Price}}_{k,n}$ 
denotes the number of vertices with in-degree equal to k in $(G^n_m)_{n \ge 1}$, $k \ge 1$, then 

\[ \frac{N^{\text{Price}}_{k,n}}{n} \longrightarrow \frac{(1 + 1/m)\,\Gamma(k)\,\Gamma(2 + 1/m)}{\Gamma(k + 2 + 1/m)} \tag{4.35} \]

almost surely as $n \to \infty$. 

A rigorous analysis of the previous result can be made using Chebyshev's inequality and 
following the same lines as in the proof of Theorem 4.2 for the II-PA model. Hence, we limit 
ourselves to presenting a scheme of the proof. 


1. In the mathematical description of the Price model, we saw that $G^{n+1}_m$ is formed from $G^n_m$ 
by adding a new vertex $v_{n+1}$ with $k_0$ directed loops, and from it a random number, $M_{n+1}$, 
of directed edges to different old vertices. This happens with probabilities proportional 
to their in-degrees as in (3.13). Conditioning on the number of vertices with in-degree 
k when there are n vertices, we obtain 


= E [E(Arr;,^;i I AT^/r)] 


= E 


1 + (C"“ -1) 


= 1 -bE 


1 - 


Mn+ikoNl^ 

+ Er=l 




fco 


nko -b 


N, 


fco 


M^+ikoNl^V 
'r>‘ko + YTi^i^i). 

(4.36) 
and, for $k > k_0$, 




; "’" ^ ^ nko + J2tiMi 

^Price 


+ 


_ M„+i(fc - l)jVf-\- __ 

nko + + J 27 =i 


M„+ikN^^r 

M, 


= E 


M^+,{k-l)Nl-% / 

nko + EliM. 


Mn+lk 

nko+ Er=i 


TkrPrice 

jVfc,n 


(4.37) 


2. Take ko — 1, Y = n + Mi and £„ = yTnn/n. By Chebyschev’s inequality, 

2 

PdF — n(l + m)| > ne„] < ^-. (4.38) 

•y/nlnn 

Let X be another random variable such that E(X/Y') is bounded, and 0 < E(X) < 
(fc — l)mn. In addition, define the event En := {n(l + m — e„) <Y < n(l + m + e„)}. 
Conditioning on En and applying (14.381) . 


E(X/F) « E(X/y \ Y £En)+0 


because E(X/F) is bounded. Furthermore, note that 


E(X) 
n(l + m) 


(i - ^ ) < E ( J y £ i?„) < 

V l + m + e„/ Vy / 


+ m + e, 

lE(Jf) 


•^nlr 


E(X) 
n(l + m) 


(4.39) 


fl- - -) 

V 1 + m — e„ / 


thus E {XjY I y £ En) = J^i^ni) 0{yj'lnn/n). Replacing this in (14.3911 we have 




(4.40) 


Using (lOm in (lOITll with X = Mn+iNlT" and in (1071) with X = M„+i(fc- 1)X|’:T;; 
and X = Mn+ikN^)))'^, respectively, we get from (14.361) and (14.371) that 


EX, 


•Price 
feo ,71+1 


1 + 1 - 


n(l + m) 


EXl’j'^® + 0(^/lnn/n), 


(4.41) 


and, for fc > 1, 


^r+rice _ m(fc-l)EXfri“ , mk 

-n(l + m) + I ^ ■ n(l+m) 


EXr”“ + 0(v/lnn/n). (4.42) 


3. Note now that (4.41) and (4.42) are almost the same as (4.5) and (4.7) for the II-PA 
model, respectively. In order to derive (4.35) we then proceed as in the proof of Theorem 
4.2. More specifically, to ensure the existence of the limit value of $N^{\text{Price}}_{k,n}/n$ as $n \to \infty$, 
we use the supermartingale convergence theorem (see [2], Theorem 35.5), in analogy to 
Lemma 4.3. Then we find that 

\[ \lim_{n \to \infty} \frac{E N^{\text{Price}}_{k,n}}{n} = \frac{(1 + 1/m)\,\Gamma(k)\,\Gamma(2 + 1/m)}{\Gamma(k + 2 + 1/m)}, \qquad k \ge 1, \tag{4.43} \]

as in Lemma 4.4. Finally, by the Azuma-Hoeffding inequality (2.1) we obtain 

\[ \frac{N^{\text{Price}}_{k,n}}{n} \longrightarrow \frac{(1 + 1/m)\,\Gamma(k)\,\Gamma(2 + 1/m)}{\Gamma(k + 2 + 1/m)}, \qquad k \ge 1, \]

in probability as $n \to \infty$. By Lemma 4.3 the result follows almost surely. 

Notice that the Price model is by definition equivalent to the II-PA if Mi = m — 1 almost 
surely. Moreover, Price and II-PA models have the same limit in-degree distribution when 
$E(M_i) = m$. 

4.2 Relation between Simon and Yule models 


Bearing in mind the construction of the Yule model as explained in Section 3.5, we underline 
that the inter-event times of in-link appearance and those related to the creation of new vertices 
are exponentially distributed. In order to relate the Yule and Simon models we investigate here 
the inter-event times characterizing the Simon model, showing that a suitable rescaling in the 
limit leads to exponential random variables. The idea is to identify two different processes 
which conditionally describe the Simon model, and clarify how these are related to the two 
Yule processes which define a Yule model. 

The next theorem together with Remarks 4.6 and 4.7 allows us to recognize the first process 
inside a Simon model behaving asymptotically as a Yule process with parameter $(1 - \alpha)$, while 
a later theorem and Remark 4.8 determine the second process, which behaves asymptotically as 
a Yule process with parameter equal to one. The first process models how the vertices get 
new in-links; thus each time a new vertex appears, a process starts. On the other hand, 
the second process is related to how the vertices appear. 

Let $(G^t_\alpha)_{t \ge 1}$ be the random graph process associated to the Simon model of parameter $\alpha$, as 
described in Subsection 3.1, and let $\{d(v_i, t)\}_{t \ge t^i_0}$ be the in-degree process associated to the 
vertex $v_i$, which appears at time $t^i_0$, i.e., $t^i_0 = \min\{t : d(v_i, t) = 1\}$ (note that $d(v_i, t^i_0) = 1$ as 
the vertex appears together with a directed loop in the model). 

Our first focus is on the study of the distributions of the waiting times between the instant 
in which each vertex has in-degree k, until that in which it has in-degree k + 1. Formally, we 
study the distribution of the random variables $W^i_k = t^i_k - t^i_{k-1}$, $k \ge 1$, where 
$t^i_j = \min\{t : d(v_i, t) = j + 1\}$ for j = 0, 1, 2, .... 

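Theorem 4.3. Let $z = \ln\big(1 + x/(t_{k-1}^i - 1)\big)$, $k \ge 1$, $x > 0$. It holds

$$\Big|\, \mathbb{P}(W_k^i \le x) - \mathbb{P}(Z_k^i \le z) \,\Big| < O\!\left(\frac{1}{t_{k-1}^i}\right), \tag{4.44}$$

where $Z_k^i$ is an exponential random variable of parameter $(1-\alpha)k$.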
Remark 4.6. Theorem 4.3 states that for any $t^*$ large enough but fixed,

$$\Big|\, \mathbb{P}(W_k^j \le x) - \mathbb{P}(Z_k^j \le z) \,\Big| < O\!\left(\frac{1}{t^*}\right), \tag{4.45}$$

$\forall j \ge \min\{i : t_0^i \ge t^*\}$, and for $k \ge 1$. This means that from a fixed but large time $t^*$, all the
waiting times $W_k^j$ are approximately exponential random variables, with an error term smaller
than $O(1/t^*)$.
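The size of the error in Theorem 4.3 can be inspected numerically. The sketch below assumes the one-step attachment probability $k(1-\alpha)/r$ at time $r$ used in the proof, computes the exact distribution function of the waiting time as a finite product, and compares it with the exponential approximation evaluated at $z = \ln(1 + x/(t_{k-1}^i - 1))$. The function names and the argument `t_prev` (standing for $t_{k-1}^i$) are illustrative assumptions.

```python
import math

def exact_cdf_waiting_time(x: int, k: int, alpha: float, t_prev: int) -> float:
    """P(W <= x) for a vertex that reached in-degree k at time t_prev (requires t_prev >= k)."""
    c = k * (1.0 - alpha)
    survival = 1.0
    for r in range(t_prev, t_prev + x):
        survival *= 1.0 - c / r  # no new in-link at step r + 1
    return 1.0 - survival

def exponential_approx_cdf(x: int, k: int, alpha: float, t_prev: int) -> float:
    """P(Z <= z) with Z ~ Exp((1 - alpha) k) and z = ln(1 + x / (t_prev - 1))."""
    z = math.log(1.0 + x / (t_prev - 1))
    return 1.0 - math.exp(-k * (1.0 - alpha) * z)

def max_gap(k: int, alpha: float, t_prev: int, x_max: int) -> float:
    """Largest |exact - approximation| over x = 1, ..., x_max, computed incrementally."""
    c = k * (1.0 - alpha)
    survival, worst = 1.0, 0.0
    for x in range(1, x_max + 1):
        survival *= 1.0 - c / (t_prev + x - 1)
        gap = abs((1.0 - survival) - exponential_approx_cdf(x, k, alpha, t_prev))
        worst = max(worst, gap)
    return worst

if __name__ == "__main__":
    for t_prev in (100, 1000, 10_000):
        print(t_prev, max_gap(3, 0.3, t_prev, 50_000))
```

In this sketch the gap shrinks roughly in proportion to $1/t_{k-1}^i$, and it vanishes up to rounding when $k(1-\alpha) = 1$, where the product telescopes exactly.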

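Proof of Theorem 4.3. By the preferential attachment probabilities of $(G_\alpha^t)_{t\ge 1}$, for $x \ge 1$,
we have

$$\mathbb{P}(W_k^i = x) = \frac{d(v_i,t_{k-1}^i)(1-\alpha)}{t_{k-1}^i+x-1}\left(1-\frac{d(v_i,t_{k-1}^i)(1-\alpha)}{t_{k-1}^i+x-2}\right)\cdots\left(1-\frac{d(v_i,t_{k-1}^i)(1-\alpha)}{t_{k-1}^i}\right) = \frac{k(1-\alpha)}{t_{k-1}^i+x-1}\prod_{r=t_{k-1}^i}^{t_{k-1}^i+x-2}\left(1-\frac{k(1-\alpha)}{r}\right), \tag{4.46}$$

where the second equality uses $d(v_i,t_{k-1}^i) = k$. Since $t_{k-1}^i \ge k$, we have $k(1-\alpha)/r < 1$
for every $r \ge t_{k-1}^i$, so we can apply Lemma 4.1 to the product to obtain

$$\prod_{r=t_{k-1}^i}^{t_{k-1}^i+x-2}\left(1-\frac{k(1-\alpha)}{r}\right) = \left(\frac{t_{k-1}^i-1}{t_{k-1}^i+x-2}\right)^{k(1-\alpha)}\left(1+O\!\left(\frac{1}{t_{k-1}^i}\right)\right). \tag{4.47}$$

Thus, using (4.46), (4.47), the Euler-Maclaurin formula

$$\sum_{j=1}^{\lfloor y\rfloor}\frac{1}{j^{s}} = 1+\frac{y^{1-s}-1}{1-s}-s\int_{1}^{y}\frac{u-\lfloor u\rfloor}{u^{s+1}}\,du-\frac{y-\lfloor y\rfloor}{y^{s}},$$

with $s \in \mathbb{R}\setminus\{1\}$ (see [19]), and the fact that $\lfloor y\rfloor \le y$, we arrive at

$$\begin{aligned}
\mathbb{P}(W_k^i \le x) &= \sum_{w=1}^{x}\mathbb{P}(W_k^i = w) \\
&\le \left(1+O\!\left(\tfrac{1}{t_{k-1}^i}\right)\right) k(1-\alpha)\,(t_{k-1}^i-1)^{k(1-\alpha)} \sum_{w=1}^{x}\left(\frac{1}{t_{k-1}^i-2+w}\right)^{k(1-\alpha)+1} \\
&\le \left(1+O\!\left(\tfrac{1}{t_{k-1}^i}\right)\right)\left[1-\left(\frac{t_{k-1}^i-1}{t_{k-1}^i-1+x}\right)^{k(1-\alpha)}\right]+O\!\left(\tfrac{1}{t_{k-1}^i}\right) \\
&= \left(1+O\!\left(\tfrac{1}{t_{k-1}^i}\right)\right)\left[1-\exp\Big(-k(1-\alpha)\ln\big(1+x/(t_{k-1}^i-1)\big)\Big)\right]+O\!\left(\tfrac{1}{t_{k-1}^i}\right).
\end{aligned} \tag{4.48}$$

Thus, together with the analogous lower bound, we get

$$\Big|\,\mathbb{P}(W_k^i \le x)-\Big[1-\exp\Big(-k(1-\alpha)\ln\big(1+x/(t_{k-1}^i-1)\big)\Big)\Big]\Big| < O\!\left(\frac{1}{t_{k-1}^i}\right).$$

Then, by taking $z = \ln\big(1+x/(t_{k-1}^i-1)\big)$, it holds

$$\Big|\,\mathbb{P}(W_k^i \le x)-\mathbb{P}(Z_k^i \le z)\Big| < O\!\left(\frac{1}{t_{k-1}^i}\right),$$

where $Z_k^i$ is a random variable exponentially distributed with parameter $(1-\alpha)k$. □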

Remark 4.7. Note that the knowledge of $\{W_k^i\}$, $k \ge 1$, is equivalent to the knowledge of
$d(v_i,t)$, as $d(v_i,t) := \min\{k : \sum_{h=1}^{k} W_h^i > (t - t_0^i)\}$. Thus, due to Theorem 4.3, the process
$\{d(v_j,t)\}_{t\ge t_0^j}$, $\forall j \ge \min\{i : t_0^i \ge t^*\}$, behaves asymptotically as a Yule process with parameter
$(1-\alpha)$.
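The equivalence stated in Remark 4.7 can be written as a short function that recovers the in-degree at time $t$ from the waiting times. This is a minimal sketch; the function name and the example values are hypothetical.

```python
from itertools import accumulate
from typing import Sequence

def in_degree_at(t: int, t0: int, waiting_times: Sequence[int]) -> int:
    """d(v_i, t) = min{k : W_1 + ... + W_k > t - t0}, for a vertex that appears at time t0."""
    if t < t0:
        raise ValueError("the vertex does not exist before time t0")
    for k, elapsed in enumerate(accumulate(waiting_times), start=1):
        if elapsed > t - t0:
            return k
    raise ValueError("not enough waiting times to determine the in-degree at time t")

if __name__ == "__main__":
    # Hypothetical vertex born at time 5 with waiting times 3, 2, 4:
    # in-degree 1 on [5, 8), 2 on [8, 10), 3 on [10, 14).
    print([in_degree_at(t, 5, [3, 2, 4]) for t in range(5, 14)])
```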

Let us now consider the growth of the vertices in the Simon model, where at each instant of
time $t$ a new vertex is created with a fixed probability $\alpha$. This fact can be re-thought from
a different perspective as follows. Remember that in the Simon model the number of vertices at
time $t$ is a random variable $V(t)$, distributed binomially, $\mathrm{Bin}(t,\alpha)$, and that at each instant
of time one and only one vertex can appear. Think for a moment that we know the number
of vertices at time $t$; then, conditionally on that, at time $t+1$ choose uniformly at random
an existing vertex, i.e., with probability $1/V(t)$ select one vertex, and with probability $\alpha$
duplicate it. Note that, as time increases, each existing vertex may give birth to a new
vertex with probability $\alpha/V(t)$. In this way a new vertex appears with constant
probability $\alpha$: since there are $V(t)$ vertices, the probability of the birth of a new vertex
is $V(t)(\alpha/V(t)) = \alpha$.
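The duplication viewpoint just described can be simulated directly. The sketch below assumes the process starts from a single vertex and records, for every new vertex, the uniformly chosen vertex that was duplicated; the function name and parameters are illustrative assumptions.

```python
import random
from typing import List, Optional

def simulate_duplication_view(steps: int, alpha: float, seed: int = 0) -> List[Optional[int]]:
    """Return parent[v] for every vertex v; the initial vertex has parent None."""
    rng = random.Random(seed)
    parent: List[Optional[int]] = [None]
    for _ in range(steps):
        chosen = rng.randrange(len(parent))  # each existing vertex has probability 1 / V(t)
        if rng.random() < alpha:             # the chosen vertex is duplicated with probability alpha
            parent.append(chosen)
    return parent

if __name__ == "__main__":
    steps, alpha = 20_000, 0.3
    parent = simulate_duplication_view(steps, alpha, seed=1)
    # The fraction of steps producing a new vertex should be close to alpha.
    print((len(parent) - 1) / steps, alpha)
```

Each vertex is duplicated with probability $\alpha/V(t)$ at every step, while the total birth probability stays equal to $\alpha$.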

Now fix a time, take for example $t_0^i$, the time when the $i$th vertex appears, so $V(t_0^i) = i$.
For each of the existing vertices at time $t_0^i$, say $v_j$, $1 \le j \le i$, define the birth process
$\{D_j(t)\}_{t\ge t_0^i}$ of all its descendants as follows. Start at time $t_0^i$ with one vertex, $v_j$. Since at
time $t+1$ each existing vertex in the Simon model may give birth to a new vertex with probability
$\alpha/V(t)$, if at time $t$ the number of vertices descending from $v_j$ (i.e., itself + its children +
its grandchildren + etc.) is $k$, the probability that a new descendant of $v_j$ appears at time
$t+1$ is $k\alpha/V(t)$. Formally, let $D_j(t)$ be the total number of descendants of $v_j$ at time $t$, with
$D_j(t_0^i) = 1$ (itself); then, if $D_j(t) = k$, $k \ge 1$, the probability that a new descendant of $v_j$
appears at time $t+1$ is $k\alpha/V(t)$.

Observe that since at each time we are selecting one and only one vertex in the Simon model
to duplicate, the probability of more than one duplication at each instant of time
$t$ is zero. Clearly this is different from the case in which we had taken independent processes
$\{D_j(t)\}_{t\ge t_0^i}$; therefore, they are dependent. However, by definition, these processes are equal
in distribution, i.e., $\mathbb{P}(D_j(t) \le d)$ is the same for each $1 \le j \le i$.

We will see in the following theorem that the processes $\{D_j(t)\}_{t\ge t_0^i}$, $1 \le j \le i$, converge
in distribution to Yule processes with parameter 1, i.e., if at time $t$, $D_j(t) = k$, $k \ge 1$,
the number of steps until the next descendant of $v_j$ appears converges in distribution to an
exponential random variable with parameter $k$. Thus, starting with $i$ vertices, we will see that
from $t_0^i$, the process of appearance of new vertices in the Simon model approximates $i$ dependent
but identically distributed Yule processes with parameter 1. If the interest is to study the
asymptotic characteristics of a uniformly chosen random vertex in the Simon model, we could do
that by first choosing uniformly at random a Yule process with parameter 1 and then
choosing uniformly at random an individual belonging to it.
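The family processes $D_j(t)$ can be tracked in a simulation by labelling every new vertex with the initial vertex it descends from. This is a sketch under the duplication mechanism described above; the function name and its arguments are illustrative assumptions.

```python
import random
from typing import List, Tuple

def simulate_descendants(i: int, steps: int, alpha: float, seed: int = 0) -> Tuple[List[int], List[int]]:
    """Start with i vertices; return (ancestor label of every vertex, family sizes D_1..D_i)."""
    rng = random.Random(seed)
    ancestor = list(range(i))  # every initial vertex is the ancestor of its own family
    sizes = [1] * i            # D_j at the starting time equals 1 (the vertex itself)
    for _ in range(steps):
        chosen = rng.randrange(len(ancestor))
        if rng.random() < alpha:
            # A family of size k is hit with probability k / V(t), so it grows
            # with probability k * alpha / V(t), as in the text.
            root = ancestor[chosen]
            ancestor.append(root)
            sizes[root] += 1
    return ancestor, sizes

if __name__ == "__main__":
    ancestor, sizes = simulate_descendants(i=5, steps=1000, alpha=0.4, seed=3)
    print(sizes, sum(sizes), len(ancestor))
```

At most one family grows at each step, which is the source of the dependence between the $i$ processes, while their role in the mechanism is symmetric.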

Formally, let $(G_\alpha^t)_{t\ge 1}$ be the random graph process corresponding to the Simon model
and, as above, let $t_0^i$ be the time when the $i$th vertex appears. Then,
for each vertex in this process up to time $t_0^i$, say $v_j$, $1 \le j \le i$, let $Y_k^j$ be the random variables
$Y_k^j := \ell_k^j - \ell_{k-1}^j$, $k = 1, 2, \dots$, where $\ell_0^j = t_0^i$ and $\ell_k^j$ is the minimum $t$ when there are
exactly $k+1$ descendants of $v_j$ in $\{D_j(t)\}_{t\ge t_0^i}$, $k \ge 1$. Hence $Y_k^j$ represents the waiting time
between the appearance of the $k$th and the $(k+1)$th vertex in $\{D_j(t)\}_{t\ge t_0^i}$.


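Theorem 4.4. Let $z = \ln\big(1 + y/(\ell_{k-1}^j - 1)\big)$, $k \ge 1$, $y > 0$, and $0 < \varepsilon_t < 1$ such that $t\varepsilon_t^2 \to \infty$
as $t \to \infty$. Then,

$$\Big|\,\mathbb{P}(Y_k^j \le y) - \mathbb{P}(Z_k^j \le z)\Big| < O\!\left(\frac{1}{\ell_{k-1}^j\,\varepsilon_{\ell_{k-1}^j}^2}\right), \tag{4.49}$$

where $Z_k^j$ is an exponentially distributed random variable of parameter $k$.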

Remark 4.8. Since $\ell_0^j = t_0^i$ and $\ell_k^j \ge k \ge 1$, the previous theorem states that for any
$t^* = t_0^i$ large enough but fixed,

$$\Big|\,\mathbb{P}(Y_k^j \le y) - \mathbb{P}(Z_k^j \le z)\Big| < O\!\left(\frac{1}{t^*\,\varepsilon_{t^*}^2}\right). \tag{4.50}$$

In words, this means that from a fixed but large time $t^*$, all the waiting times $Y_k^j$ are
approximately exponential random variables of parameter $k$, with an error term smaller than
$O\big(1/(t^*\varepsilon_{t^*}^2)\big)$. Thus, for $t^*$ large enough we start to see a process which is very close to a
Yule process with parameter 1.


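Proof of Theorem 4.4. Let us define the Bernoulli random variables $X_{k,l}^j$, $l \ge 1$, with
$\mathbb{P}(X_{k,l}^j = 1) = k\alpha/V(\ell_{k-1}^j + l - 1) = 1 - \mathbb{P}(X_{k,l}^j = 0)$, so that $\{X_{k,l}^j = 1\}$ denotes the event that any of the $k$
descendants of $v_j$ in $\{D_j(t)\}_{t\ge t_0^i}$ gives birth to a new one at time $\ell_{k-1}^j + l$. Note that the event
$\{Y_k^j = y\}$ is equivalent to the event $\{X_{k,1}^j = 0, X_{k,2}^j = 0, \dots, X_{k,y}^j = 1\}$. Now define the events
$\mathcal{E}_t := \{t(\alpha-\varepsilon_t) < V(t) < t(\alpha+\varepsilon_t)\}$. By Chebyshev's inequality we have $\mathbb{P}(\mathcal{E}_t^c) \le \alpha(1-\alpha)/(t\varepsilon_t^2)$,
so $\mathbb{P}(\mathcal{E}_t) \to 1$ if $t\varepsilon_t^2 \to \infty$ as $t \to \infty$. Then observe that

$$\mathbb{P}(X_{k,l}^j = x) - \mathbb{P}\big(X_{k,l}^j = x \,\big|\, \mathcal{E}_{\ell_{k-1}^j+l-1}\big) = O\!\left(\frac{1}{(\ell_{k-1}^j+l-1)\,\varepsilon_{\ell_{k-1}^j+l-1}^2}\right)$$

and

$$\mathbb{P}(Y_k^j = y) - \left[\mathbb{P}\big(X_{k,y}^j = 1 \,\big|\, \mathcal{E}_{\ell_{k-1}^j+y-1}\big)\prod_{l=1}^{y-1}\mathbb{P}\big(X_{k,l}^j = 0 \,\big|\, \mathcal{E}_{\ell_{k-1}^j+l-1}\big)\right] = O\!\left(\frac{1}{\ell_{k-1}^j\,\varepsilon_{\ell_{k-1}^j}^2}\right). \tag{4.51}$$

Assuming that $\varepsilon_{\ell_{k-1}^j+y-1} > \varepsilon_{\ell_{k-1}^j+y-2} > \dots > \varepsilon_{\ell_{k-1}^j}$, we obtain that the product in (4.51)
is bounded above by

$$\frac{\alpha k}{(\ell_{k-1}^j+y-1)\big(\alpha-\varepsilon_{\ell_{k-1}^j+y-1}\big)}\prod_{r=\ell_{k-1}^j}^{\ell_{k-1}^j+y-2}\left(1-\frac{\alpha k}{r(\alpha+\varepsilon_r)}\right) \tag{4.52}$$

and bounded below by

$$\frac{\alpha k}{(\ell_{k-1}^j+y-1)\big(\alpha+\varepsilon_{\ell_{k-1}^j+y-1}\big)}\prod_{r=\ell_{k-1}^j}^{\ell_{k-1}^j+y-2}\left(1-\frac{\alpha k}{r(\alpha-\varepsilon_r)}\right). \tag{4.53}$$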


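Thus, in a similar manner as we did in the proof of Theorem 4.3, by applying Lemma 4.1 and the
Euler-Maclaurin formula to (4.52) and (4.53), we find that

$$\Big|\,\mathbb{P}(Y_k^j \le y) - \Big[1-\exp\Big(-k\ln\big(1+y/(\ell_{k-1}^j-1)\big)\Big)\Big]\Big| < O\!\left(\frac{1}{\ell_{k-1}^j\,\varepsilon_{\ell_{k-1}^j}^2}\right).$$

Then, taking $z = \ln\big(1+y/(\ell_{k-1}^j-1)\big)$, we obtain

$$\Big|\,\mathbb{P}(Y_k^j \le y) - \mathbb{P}(Z_k^j \le z)\Big| < O\!\left(\frac{1}{\ell_{k-1}^j\,\varepsilon_{\ell_{k-1}^j}^2}\right),$$

where $Z_k^j$ is an exponentially distributed random variable with parameter $k$, which proves the
thesis. □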
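The Chebyshev step in the proof above can be checked exactly for a binomial $V(t)$. The sketch below compares the exact probability that $V(t)$ leaves the band $(t(\alpha-\varepsilon_t), t(\alpha+\varepsilon_t))$ with the bound $\alpha(1-\alpha)/(t\varepsilon_t^2)$. The choice $\varepsilon_t = t^{-1/4}$, which satisfies $t\varepsilon_t^2 \to \infty$, and the function names are assumptions made only for illustration.

```python
import math

def log_binom_pmf(v: int, t: int, alpha: float) -> float:
    """log P(V = v) for V ~ Bin(t, alpha), computed on the log scale to avoid overflow."""
    return (math.lgamma(t + 1) - math.lgamma(v + 1) - math.lgamma(t - v + 1)
            + v * math.log(alpha) + (t - v) * math.log(1.0 - alpha))

def prob_outside_band(t: int, alpha: float, eps: float) -> float:
    """Exact probability that V(t) is not in (t(alpha - eps), t(alpha + eps))."""
    lo, hi = t * (alpha - eps), t * (alpha + eps)
    return sum(math.exp(log_binom_pmf(v, t, alpha))
               for v in range(t + 1) if not (lo < v < hi))

def chebyshev_bound(t: int, alpha: float, eps: float) -> float:
    """Upper bound alpha (1 - alpha) / (t eps^2) on the probability above."""
    return alpha * (1.0 - alpha) / (t * eps ** 2)

if __name__ == "__main__":
    alpha = 0.3
    for t in (100, 400, 1600):
        eps = t ** -0.25  # illustrative choice with t * eps^2 = sqrt(t)
        print(t, prob_outside_band(t, alpha, eps), chebyshev_bound(t, alpha, eps))
```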


5 Discussion and conclusions 

To compare the Barabasi-Albert and Simon models, we considered a third model that we
called here the II-PA model, first introduced in [?] with a different name. Then we gave a
common description of the three models by introducing three different random graph processes
related to them. This representation allowed us to clarify in which sense the three models
can be related. For each fixed time, if $m = 1$, we proved that the Barabasi-Albert and
the II-PA models have exactly the same preferential attachment probabilities (Theorem 4.1).
Furthermore, since in the first model the preferential attachment is meant with respect to the
whole degree of each vertex while in the second case it is meant with respect only to the
in-degree, the conclusion is that, for a uniformly selected random vertex, the degree distribution
in the Barabasi-Albert model equals the in-degree distribution in the Simon model. Note
that $m = 1$ is the only case in which this is true.

Since the direct comparison between the Barabasi-Albert and Simon models is not possible,
we first compared the II-PA model with the Barabasi-Albert model (Theorem 4.1), and then the II-PA
model with the Simon model (Theorem 4.2). We underline that, even if the introduction of the II-PA
model was functional to the study of the connections between the Barabasi-Albert and Simon
models, this hybrid model is interesting in itself.

Regarding the connections between the Simon and II-PA models, Theorem 4.2 shows that
when time goes to infinity, the II-PA model has the same limiting in-degree distribution as
that of the Simon model with parameter $\alpha = 1/(m+1)$, for any $m \ge 1$. The proof uses the
Azuma-Hoeffding concentration inequality and the supermartingale convergence theorem.

Combining Theorems 4.1 and 4.2 we conclude that, in the limit, the Simon model has the
same in-degree distribution as that of the Barabasi-Albert model, for $\alpha = 1/2$ and $m = 1$.
The existing relations between the three models are summarized in Figure 3.
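The coincidence for $\alpha = 1/2$ and $m = 1$ can be verified in exact arithmetic. For $m = 1$ the limit in (4.43) reads $2\,\Gamma(k)\Gamma(3)/\Gamma(k+3)$, and the sketch below compares it with the closed form $4/(k(k+1)(k+2))$ of the asymptotic Barabasi-Albert degree distribution for $m = 1$, which is quoted here as a known external fact and is not derived in this section. The function names are illustrative assumptions.

```python
from fractions import Fraction
from math import factorial

def simon_limit_pmf(k: int, rho: int) -> Fraction:
    """rho * B(k, rho + 1) for integer rho, i.e. rho * rho! / (k (k+1) ... (k+rho))."""
    denominator = 1
    for j in range(rho + 1):
        denominator *= k + j
    return Fraction(rho * factorial(rho), denominator)

def barabasi_albert_pmf_m1(k: int) -> Fraction:
    """Closed form 4 / (k (k+1) (k+2)), k >= 1 (assumed known result for m = 1)."""
    return Fraction(4, k * (k + 1) * (k + 2))

if __name__ == "__main__":
    rho = 2  # rho = 1 / (1 - alpha) = 1 + 1/m for alpha = 1/2, m = 1
    print(all(simon_limit_pmf(k, rho) == barabasi_albert_pmf_m1(k) for k in range(1, 200)))
```

Using `Fraction` keeps the comparison free of rounding error, so the equality check is exact.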

On the other hand, the Yule model is defined in continuous time. In Section 4.2 we give
a mathematical explanation of the reason why, when time goes to infinity, the distribution
of the size of a genus selected uniformly at random in the Yule model coincides with the
in-degree distribution of the Simon model. More precisely, we recognize which are the two different
processes that describe the Simon model and how they are related to a Yule model. Theorem
4.3 and Theorem 4.4 show that, as time flows, these two different processes approximate
the behavior of a continuous time process that in fact corresponds to a Yule model with
parameters $(1-\alpha, 1)$. This result is obtained in probability.

Many other preferential attachment models have appeared in the literature in the last
years. In [5], for instance, a general model of web graphs is studied. With the right choice
of the parameters this model includes the Barabasi-Albert model; however, the Simon and Yule
models do not fit into the general set of assumptions considered in [5]. For a discussion
of several related preferential attachment models see, for example, [23], Chapter 8, or [?],
Chapter 4.



Figure 3: The relations between the Simon, II-PA, Price and Barabasi-Albert (B-A in the picture) models. Note
that the II-PA and Barabasi-Albert models can be put in relation for any time $t$ but just in the case $m = 1$. Instead,
the connections between the II-PA and Simon models and between the Simon and Barabasi-Albert models hold in the
limit for $t$ going to infinity (w.r.t. in-degree or degree distribution). We also include the Price model, which is by
definition equivalent to the II-PA model if $M_i = m = 1$ almost surely. Moreover, the Price and II-PA models have the same
limit in-degree distribution when $\mathbb{E}(M_i) = m$.